a Clear, condensed summary of the procedure

sults as space permits.

To ensure that the Brief Report will be no longer than one printed page, its typescript, including all matter except the title and the author’s lines, must not exceed 85 lines averaging 42 characters and spaces in length. Set the typewriter margins for short lines of 42 characters, which are 3.5 inches long in elite typing, and 4.2 inches long in pica.

The manuscript of the Brief Report must 
be double spaced throughout. Except for its 
short lines, it follows the standard style of 
the 1957 revision of the APA Publication 
Manual. Headings, tables, and references are 
avoided or, if essential, must be counted in 
the 85 lines. Each Brief Report must be ac- 
companied by a footnote in the style below, 
which is typed on a separate sheet and not

tained without charge from John Doe, 300 Market
St., Prospect 6, Mass. (giving the author’s full name

mentation Institute. Order Document No. ____, re-

economy in duplication. (b) Tables and figures
should be placed adjacent to the text
which refers to them. A caption should be 
typed below each figure. (c) Footnotes should 
be typed at the bottom of the page on which 
reference is made to them. In other respects, 
the full report is prepared in the style specified
by the Publication Manual.

SOME FACTORS IN PSYCHOTHERAPISTS’ PERCEPTION OF THEIR PATIENTS 1

NORMAN I. HARWAY

The psychotherapist, during the course of
his contacts with a patient, attains a degree 
of knowledge of his patient. The understand- 
ing that the therapist develops with regard to 
his patient is not, however, merely a matter 
of extent or quantity of information. Both in 
the internal experience of the therapist and in 
his expressions of his knowledge it is evident 
that different degrees of certainty adhere to 
the varying things that the therapist knows 
about his patient. Some statements about the 
patient are offered with little hesitation or 
doubt, whereas others are formulated tenta- 
tively and expressed with many qualifications. 

There has been much discussion of the ade- 
quacy of clinical judgment. These discussions 
for the most part have been concerned with 
whether statistical or clinical methods lead to 
a greater number of accurate predictions of 
patient behavior (Meehl, 1954). Ability to 
predict is but one criterion of knowledge, how- 
ever. Awareness of what one does or does not 
know regarding a person would also be a re- 
flection of knowledge as would internal con- 
sistency in the statements expressing one’s 
understanding of a person. 

A number of studies in recent years have 
been concerned with the ability of one indi- 
vidual to predict the behavior of another in- 
dividual. In connection with psychotherapy, 
Kelly and Fiske (1951) used the therapist's 
ability to accurately predict patient response 
as one measure of therapeutic competence. 
The therapist’s estimation of the similarity 
between himself and his patient, or of the 
similarity between the therapist’s “ideal” and 
the patient’s self-descriptions, has also been 
suggested as relating to competence (Fiedler,
1951; Kelly & Fiske, 1951). Dymond (1953),
who used the predictive method as a measure
of empathy, has shown that counselors’ predictions
of client responses are not primarily
a function of the counselors’ stereotype of the
average client.

1 This investigation was supported in part by a
grant from the Foundations Fund for Research in
Psychiatry.

The therapist’s statements regarding the 
patient derive from many sources. Among 
those which may be posited are the patient 
himself as a stimulus, some general attribute 
or capacity of the therapist as a perceiver or 
knower of other people, the personal qualities 
of the therapist as reflected in his conflicts, 
desires, attitudes, and needs, and the theo- 
retical schema which the therapist utilizes in 
observing and thinking about the patient. 
These form a complex structure which relates 
to the therapist’s understanding, empathy, 
and reactions to the patient. 

The present study is an attempt to explore 
some aspects of this structure. The questions 
investigated are: (a) Is the therapist aware 
of what he knows and does not know about 
the patient? In terms of the operations here 
involved we may phrase the question: Is the 
therapist’s confidence in his predictions of a 
patient’s responses to a personality questionnaire
related to the accuracy of these predictions?
(b) Is there any relation between the
therapist’s knowledge of specific behaviors of 
his patient and his description of the patient 
in more abstract terms? This may be ex- 
amined by a comparison of a personality 
profile derived from individual item predic- 
tions with a profile constructed in terms of 
the therapist’s more global perceptions of the 
patient. (c) Does the therapist’s orientation 
relate to the prediction he makes for the pa- 
tient? In part, the therapist's orientation or 
viewpoint is reflected in the particular goals 
he sets for a patient and in the relative importance
he assigns these goals. It is this
somewhat specific feature of the therapist’s 
frame of reference that is tapped in the pres- 
ent instance. (d) To what extent do the 
therapist’s own characteristics influence his 
perception of the patient and the goals that 
he sets for the patient? 

While the problem falls under the general 
rubric of interpersonal perception, the psy- 
chotherapeutic setting was chosen because of 
the intensity of the relationship that de- 
velops and the role assignments in which it 
is the duty of one person—the therapist—to 
know, to understand, to empathize with the 
other. The therapeutic procedure is designed 
in many ways to maximize knowledge of the 
patient. 


Procedure 


Nine psychotherapists participated in the 
present study. The attempt was to obtain a 
reasonably homogeneous therapist group. In 
the current instance the nine were either 
senior psychiatric residents or junior staff 
members in psychiatry, all of whom had from 
two to four years’ training in psychiatry at 
the same institution. While their background 
was in many ways eclectic, the major frame 
of reference might best be categorized as psy- 
choanalytically oriented. All, at the time of 
the study, were treating patients in the out- 
patient department of a university hospital, 
none were fully trained psychoanalysts, nor 
were they using the formal psychoanalytic 
method, and all were working under the supervision
of a tutor with whom they met periodically
and frequently to discuss their psychotherapeutic
activities.2

The instrument used was the Edwards Per- 
sonal Preference Schedule (EPPS) (1954). 
This is a 225-item personality questionnaire 
based on the theory of Murray (1938). The 
test is scored in terms of 15 manifest needs: 
achievement, deference, order, exhibition, au- 
tonomy, affiliation, intraception, succorance, 
dominance, abasement, nurturance, change,
endurance, heterosexuality, aggression. The
test is designed to minimize the influence of
social desirability as a response determiner.
Statements representative of each need area
are paired twice with each of the other manifest
needs. For each item (pair of statements)
the subject (S) records which of the
two statements is more characteristic of what
he likes or how he feels. A measure of test
consistency and a measure of profile stability
can be obtained as well as the primary need
scores.

2 The writer wishes to express his appreciation to the psychiatrists who gave so generously of their time and to the patients who participated in taking the personality inventory. I am indebted to the participants in the Research Conference of the Department of Psychiatry for their counsel.
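The 225-item length is consistent with the pairing scheme just described, taken together with the 15 repeated consistency items mentioned in the Results. The following is a minimal sketch of that arithmetic, assuming that every unordered pair of the 15 needs contributes exactly two items and that the repeated items are counted within the 225; the variable names are illustrative and do not come from the test manual.

```python
from itertools import combinations

# The 15 manifest needs listed in the text.
NEEDS = [
    "achievement", "deference", "order", "exhibition", "autonomy",
    "affiliation", "intraception", "succorance", "dominance", "abasement",
    "nurturance", "change", "endurance", "heterosexuality", "aggression",
]

PAIRINGS_PER_NEED_PAIR = 2  # each need is paired twice with each other need
REPEATED_ITEMS = 15         # assumption: repeats are part of the 225 items

need_pairs = list(combinations(NEEDS, 2))  # every unordered pair of needs
paired_items = PAIRINGS_PER_NEED_PAIR * len(need_pairs)
total_items = paired_items + REPEATED_ITEMS

print(len(need_pairs), paired_items, total_items)
```

Under these assumptions the count of paired items plus the repeated items reproduces the stated length of the questionnaire.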

Each of the nine therapists filled out the 
EPPS with the usual instructions, that is, as 
he felt the various items were applicable to 
himself. Shortly after this, the writer met in- 
dividually with each therapist and reviewed 
with him the patients under his care. While 
an attempt was made to select a patient from 
the caseload of each therapist who would be 
homogeneous with respect to such variables as 
age, sex, diagnosis, length of treatment, and 
type of treatment with the patients selected 
for the other therapists, only limited success 
was achieved. The essential characteristics of 
the patients are presented in Table 1. The 
age range is 17 to 35; six are male, three fe- 
male. All patients had been seen by their par- 
ticular therapist for at least 20 hours. 

At least six weeks were permitted to elapse 
before the second step in the study. Arrange- 
ments were made with each therapist and pa- 
tient so that at the end of a specified therapy 
hour the patient immediately went to the psy- 
chology offices where the EPPS was adminis- 
tered. At the same time, back in his office, 
the therapist attempted to answer the EPPS 
as he thought the patient was doing it.3 Additional
tasks were set for the therapist. As
he responded to each item as he thought the
patient was responding, he was asked to indicate
on a five-point scale the degree of confidence
that he had that his statement for the
particular item would match the patient’s response
to that item. When he completed this
task the therapist took a sheet which had
listed on it the 15 need areas for which the
EPPS is scored. Each area was defined as in
the EPPS manual. The therapist rank ordered
the 15 areas in terms of his opinion as to their
dominance for the patient at that time. Assuming
the validity of the EPPS, the therapist
was, in essence, making a crude psychogram
of the patient’s personality.

3 Other writers have made the distinction between prediction, where one states in advance what another person will do, and postdiction, where one says what the other individual has already done though knowledge of the other’s performance has been withheld. In this study, then, we are dealing with codiction. The term codiction will be used when referring to our procedure to emphasize that the therapist was aware of both the setting in which the patient was responding and the immediate mood of the patient. When referring to the work of others or to the class of procedures relating to one person saying what another has done, is doing, or will do, I will use the more usual term, prediction.

After a lapse of at least six weeks the therapist was again given the EPPS, this time with the instruction to complete the form as he would like to have his patient complete it at the end of the therapeutic process. At the same time, the therapist was to rate on a five-point scale how important it was to him that at the end of treatment his patient complete each item in this particular way.

To recapitulate, for each of the 15 need areas we have seven scores based on: (a) patient’s performance on questionnaire, (b) therapist’s performance on questionnaire, (c) therapist’s codiction of patient’s performance, (d) therapist’s confidence rating for the accuracy of each of his codictions, (e) rank order psychogram by therapist, (f) therapist’s statement of what he would eventually want patient to be like, (g) importance ratings by therapist for each item as he does the immediately preceding task.

Table 1

Description of Patient

Column headings: Patient; Education*; Interviews with Therapist; Diagnosis

Diagnosis entries, in printed order:
Character trait disturbance
Adjustment reaction of adolescence
Character trait disturbance
Obsessive-compulsive neurosis
Schizophrenic reaction, simple
Adjustment reaction of adolescence
Hysterical character disorder
Conversion reaction
Adjustment reaction of adolescence

* Number of years of formal schooling

Results 


We have treated each patient-therapist pair 
as if they constituted a separate experiment. 
Thus, the nine pairs can be taken as nine 
replications of the same experiment. This en- 
ables each pair to act as its own control. Our 
concern, then, is to compare for each thera- 
pist those areas for which he shows maximum 
knowledge with those for which he shows less 
knowledge. We have then combined the results
of these nine replications on the assumption
that these might be considered as successive
samples from a population of similarly
qualified therapists. The nine therapists represent
nine-tenths of the population of therapists
meeting the description given above who
were available to us locally.

The normative sample on which the EPPS was standardized consisted of college men and women between the ages of 15 and 59, with the majority being at the lower end of the age range. To test the adequacy of the test for our sample, the measures of test consistency and profile stability were computed. Test consistency is measured by repeating 15 items in the inventory; the score is the number of items answered in identical fashion on both occasions. Profile stability is evaluated by correlating the sum of half the items contributing to a need score with the sum of the alternate half. The correlation is across the 15 personality variables for a single S. The data are presented in Table 2. Edwards (1954) suggests that if an S obtains a consistency score of less than nine the validity of that particular test may be questioned (p. 7). Neither the patients nor the therapists, regardless of the instruction set used, show a lack of consistency by this standard. Further, the profile stability coefficients compare favorably with those reported in the test manual. The median correlation reported there is approximately .71. It seems proper then to use this test with a patient population. It is also reasonable to conclude that the various instructional sets did not invalidate the use of the questionnaire.
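The two adequacy checks described above can be sketched in a few lines. This is an illustration only, assuming that the consistency score is a simple count of matching answers on the 15 repeated items and that profile stability is a product-moment correlation between the two half-scores across the 15 needs; the function names and all response data below are invented for the example and are not taken from the study or the test manual.

```python
from math import sqrt

MIN_ACCEPTABLE_CONSISTENCY = 9  # cutoff attributed to Edwards (1954) in the text


def consistency_score(first_answers, repeat_answers):
    """Count repeated items answered identically on both occasions."""
    return sum(a == b for a, b in zip(first_answers, repeat_answers))


def pearson_r(x, y):
    """Product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def profile_stability(half_a, half_b):
    """Correlate the two half-scores across the need areas of a single S."""
    return pearson_r(half_a, half_b)


# Invented answers ("A" or "B") to the 15 repeated items on both occasions.
first = list("ABABABABABABABA")
repeat = list("ABABABABABABBAB")
score = consistency_score(first, repeat)
print(score, score >= MIN_ACCEPTABLE_CONSISTENCY)

# Invented half-scores for the 15 need areas of one S.
half_a = [7, 5, 6, 8, 4, 7, 9, 3, 6, 5, 8, 7, 6, 9, 4]
half_b = [6, 5, 7, 7, 5, 8, 8, 4, 6, 4, 7, 8, 5, 9, 5]
print(round(profile_stability(half_a, half_b), 2))
```

A record would be flagged for questionable validity only when its consistency score falls below the cutoff of nine.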

Assuming the equal likelihood of a positive 
or a negative relationship between the vari- 
ables in which we are interested, we find that 
by the binomial method we would expect all 
nine replications to come out in the same di- 
rection twice in a thousand times and eight 
out of nine to fall in the same direction two 
times in one hundred. However, as we set 
forth no directional hypotheses, a two-tailed 
test seemed appropriate and the respective 
values would then be .004 and .04. Eight out 
of nine in the same direction with a p value 
of .04 is taken as constituting a statistically 
significant finding. 
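The probabilities quoted above follow from the binomial distribution with nine replications and an even chance of either direction. A minimal sketch of the computation, in which the function name `sign_test_p` is illustrative only:

```python
from math import comb


def sign_test_p(n_pairs: int, n_same: int, two_tailed: bool = True) -> float:
    """Chance that at least n_same of n_pairs results share one direction.

    Assumes each replication is equally likely to come out positive or
    negative, as stated in the text.
    """
    one_tail = sum(comb(n_pairs, k) for k in range(n_same, n_pairs + 1)) / 2 ** n_pairs
    return min(1.0, 2 * one_tail) if two_tailed else one_tail


for same in (9, 8):
    one = sign_test_p(9, same, two_tailed=False)
    two = sign_test_p(9, same)
    print(same, round(one, 4), round(two, 4))
```

The one-tailed values are 1/512 and 10/512, which round to the "twice in a thousand" and "two in one hundred" of the text; doubling them gives the two-tailed values of about .004 and .04.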

The therapist’s score for each need area 
when he attempted to codict the patient’s responses
was compared by means of Q correlation
with the scores derived from the patient’s
performance on the EPPS. The nine
correlations, one for each patient-therapist 
pair, are presented in Table 3, Column 1. It 
will be noted that eight of the nine correlations
are positive, one falling at the .03 probability
level. If we compare the rank order
psychogram as drawn up by the therapist, 
with the relative ordering of the areas as de- 
rived from the patient’s responses (Column 2), 
we find that all nine correlations suggest a 
positive relationship. The negative sign in this 
instance results from a rank of one repre- 
senting the most dominant area and a rank 
of 15 representing the least dominant area in 
the therapist’s ranking. These were correlated 
with the actual scores for the 15 need areas 
based on the patient’s responses where a high 
score represents the relative dominance of an 
area. Here, again, one of the individual cor- 
relations is significant at the .02 level. It is 
possible, however, that one of these sets of 
correlations is a statistical artifact. There is 
a high degree of relationship between the 
therapists’ codictions at the two levels—the 
sum-of-items score and the more global rank 
ordering. Table 3, Column 3, shows that all 
nine coefficients imply a positive relationship, 
and all are significant at the .02 level or be- 
yond. If one of these two levels correlates 
with the patient’s psychogram, then the fact 
that the other does so also is a matter of statistics
rather than psychology. It is not possible
to extract from these data which two of
the three sets of correlations (Columns 1, 2,
3) are independent. Are the therapists equally
good codictors at both levels and therefore
the two levels turn out to be correlated? Or
does the relationship between codiction and
patient response exist at only one level, but
since the therapists are consistent at different
levels there is also a positive relationship
at the second level?

Table 2

Test Consistency and Profile Stability Measures

         Patient            Therapist          Codiction          Goals
Row      Consist.  Stab.    Consist.  Stab.    Consist.  Stab.    Consist.  Stab.
1        12        .96      ?         .82      13        .87      13        .87
2        11        .73      ?         .92      11        .93      12        .79
3        15        .84      ?         ?        13        .88      11        .80
4        13        .74      ?         .91      14        .93      15        .97
5        14        .88      ?         .91      12        .84      14        .92
6         9        .92      ?         .78      12        .82      12        .84
7        10        .95      ?         ?        14        .84      11        .84
8        10        ?        ?         .62      12        .17      13        .70
9        13        .93      ?         .84      14        .94      11        .89
Median             ?                  ?                  .87                .84

Note. A question mark stands for an entry that is illegible in the source. Four therapist consistency scores are legible (12, 14, 10, 14), but the rows to which they belong cannot be determined.

Table 3

Intercorrelations

Columns 1 to 3: Patient with Codiction; Patient with Rank Order; Codiction with Rank Order

Entries for these columns are not legible in the source.

a Negative correlation signifies positive relationship.
* p < .05

The size of the correlations, however, sug- 
gests that the individual therapists are not 
highly efficient predictors. Our results are 
comparable to Dymond’s (1953); she reports 
correlations ranging from .05 to .84, with a 
median of .41 for counselor predictions of 
client responses at the end of treatment. 
Column 1, Table 3, shows a range from .00 
to .55, with a median of .36. Reanalyzing our 
data by the more usual procedure of an item- 
by-item rather than need-area score basis and 
assuming that a random response would lead 
to half the items being accurately codicted, 
we find that in all nine instances the thera- 
pist had more accurate than inaccurate codic- 
tions, seven of the nine chi square values are 
significant beyond the .05 level, and the com- 
bined chi square with nine degrees of free- 
dom has a p < .001. 
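The item-by-item reanalysis can be sketched as a chi square test of each therapist's accurate and inaccurate codictions against an even split, with the nine values then summed and referred to nine degrees of freedom. The source does not report the individual counts, so the counts below are invented for illustration, and the use of all 225 items per therapist is likewise an assumption; the critical values are the standard tabled ones.

```python
CRIT_CHI2_DF1_P05 = 3.841    # tabled chi square, 1 df, .05 level
CRIT_CHI2_DF9_P001 = 27.877  # tabled chi square, 9 df, .001 level

N_ITEMS = 225  # assumption: every EPPS item enters the item-by-item count


def chi_square_vs_chance(n_accurate: int, n_items: int) -> float:
    """Chi square (1 df) for accurate/inaccurate counts against a 50:50 split."""
    expected = n_items / 2
    n_inaccurate = n_items - n_accurate
    return ((n_accurate - expected) ** 2 + (n_inaccurate - expected) ** 2) / expected


# Invented numbers of accurately codicted items for nine therapists.
hypothetical_hits = [130, 128, 135, 140, 126, 133, 138, 120, 131]

chi_values = [chi_square_vs_chance(h, N_ITEMS) for h in hypothetical_hits]
n_significant = sum(c > CRIT_CHI2_DF1_P05 for c in chi_values)
combined = sum(chi_values)  # sum of nine independent 1-df values has 9 df

print([round(c, 2) for c in chi_values])
print(n_significant, round(combined, 2), combined > CRIT_CHI2_DF9_P001)
```

Because chi square ignores direction, the summed value is informative here only in conjunction with the observation that all nine therapists had more accurate than inaccurate codictions.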

To analyze another triad which may lead to 
a spurious evaluation of accuracy of codiction: 
there is no consistent relationship between 
the relative strength of patient’s needs and 
those of the therapist (Table 3, Column 4) 
nor is there a consistent relationship between 
the relative strength of the needs for the 
therapist himself and the relative strength 
that he ascribes to them for the patient (Col- 
umn 5). 


Table 3 (continued)

Columns 4 to 7: Patient with Therapist; Therapist with Codiction; Accuracy with Confidence; Accuracy with Consistency

Entries for these columns are not legible in the source.

To test whether the therapist knew what he 
knew, the confidence he expressed in his co- 
dictions for a need area was compared with 
his accuracy in codicting for that area of the 
patient’s personality. Correlating the sum of 
the confidence ratings for each area with the 
number of accurately codicted items in that 
area we find that eight of the nine correla- 
tions are positive, one being significant beyond 
the .01 level (Table 3, Column 6). It seems 
likely, then, that the therapist’s sense of cer- 
tainty or doubt is related to his actual knowl- 
edge. Here, again, the individual correlations 
are not very high. However, comparing mean 
confidence rating of accurately codicted items 
with the mean confidence rating of the inac- 
curate items, the mean rating of the former 
is greater for all nine therapists, four of the
nine t tests are significant at the .05 level or
better, and the combined probability of the
nine t tests is less than .001. The fact of
greater or lesser consistency between the two
levels of prediction by the therapist for a
given area did not seem to relate to the accuracy
in predicting for that area of the patient’s
personality (Column 7).

The final task set for the therapist was to
indicate how he would like the patient to answer
the statements on the EPPS at the completion
of treatment and how important he
felt it to be that the patient answer the particular
response in the particular way. If we
compare the scores for each need as derived
from this response set with the importance assigned
to the items in the various need areas
(Table 4, Column 1), eight of the nine cor- 
relations are positive, three being significant 
beyond the .03 level. It would seem, then, that 
those areas in which the therapist would like 
his patient to score high are the areas which 
are most important for the therapist. Neither 
the goal scores nor the importance ratings 
show a consistent pattern over the nine thera- 
pists with the codictions (Columns 2 and 4) 
nor with the rank ordering of the areas by 
the therapist (Columns 3 and 5). In other 
words, they do not seem to have influenced 
the therapist’s predictions for the patient, 
nor his perception of the patient at the time 
of the study. The goal scores, however, are 
related to the therapist’s own responses to the 
questionnaire (Column 6); two correlations 
are significant beyond the .03 level. The in- 
dication here is that the therapist would like 
the patient to become more like the therapist 
as a result of treatment. The importance of 
an area for the therapist did not relate to his
accuracy of codicting that area (Column 7). 


Discussion 
It has been suggested that accurate predic- 
tion may be a function of factors other than 
knowledge of the predictee (Cronbach, 1955; 
Gage & Cronbach, 1955). In certain instances 
it may be the result of some general response 
set. The EPPS, in which the social desirability
of the various subscales is controlled, was
chosen to eliminate the likelihood of a gen- 
eral desirability or social acceptability set de- 
termining codictions. In certain instances pre- 
dictive accuracy may be enhanced by “pro- 
jection.” In terms of the operations involved, 
if the predictor projects—that is, answers for 
the predictee as for himself—then he may be 
accurate not through any knowledge of the 
patient but because the predictee “acciden- 
tally” happens to be like himself. In the pres- 
ent data there is no significant relationship 
between the need patterns of patient and 
therapist nor between the therapist’s own pat- 
tern and the one he codicts for the patient. 
In requiring a therapist to make a set num- 
ber of predictions about his patient to obtain 
an over-all measure of the therapist’s under- 
standing, the assumption is made that equally 
relevant information is available to all pre- 
dictors and that equally relevant information 
is available regarding the various predictions 
to be made. The knowingness of the therapists,
however, resides as much in their awareness
of their relative understanding of different
aspects of their patients as in their
ability to codict what their patient will do on
a personality questionnaire. The positive correlation
between the item codiction profile
and the rank order profile suggests an internal
consistency to the therapist’s conceptualization
of the patient. The consistency between
the formulations at the two levels is
not related, however, to the accuracy of codiction
at either level.

The therapist’s knowledge or understanding 
of his patient has been treated here as a vari- 
able rather than as a constant process. Those 
studies utilizing predictive accuracy or as- 
sumed similarity (the similarity between the 
therapist’s predictions for the patient and his 
own responses to the task) as measures of 
therapeutic competence, empathic ability, or 
a general tendency to have a warmer feeling 
and liking for one’s patient focus on these 
measures as indices of constant processes 
within the therapist, that is, as being pri- 
marily a general attribute or capacity of the 
therapist (Cronbach, 1955). Our assumption 
has been that understanding is not only a 
function of the patient (that is, the stimulus
object) as well as of the therapist and the 
setting in which he knows the patient, but 
that understanding is variable with respect to 
different aspects of the stimulus object. In 
the present instance, then, we have compared 
different indicators of the therapist’s knowl- 
edge with respect to one stimulus object and 
have investigated the relationship of under- 
standing to the viewpoint or frame of refer- 
ence of the therapist. 

Though we have used the terms viewpoint, 
frame of reference, and orientation somewhat 
interchangeably, the thing we are interested 
in is the theory which the therapist utilizes. 
Our approach here has been indirect and ad- 
mittedly too limited. The problem of investi- 
gating the higher order abstractions of the 
clinician in a manner which will enable direct 
comparison with more concrete statements 
has been a stumbling block (Cartwright & 
French, 1939). The consistency between the 
therapists’ formulations at the two levels ob- 
tained in the present study provides some 
data on this matter. Unfortunately, Murray’s 
system on which the EPPS is based, though 
closely related to psychoanalytic thinking, is 
not identical with the frame of reference of 
our therapists, and many of our therapists 
probably had not yet achieved a clear, con- 
sistent, conscious theoretical position. Our 
findings are also limited by the level of train- 
ing of the therapists. 

The discrepancies that exist between analyses
based on summing the individual items
into need scores and those based on the items
themselves (cf. agreement between patient re- 
sponse and codiction and between confidence 
rating and accuracy) also may be considered 
in this context. The therapists showed a high 
degree of accuracy in codicting the individual 
items, yet when the codictions are summed 
by Edwards’ scoring system the relationship 
to the profile based on the patients’ responses 
is not very impressive. Perhaps the items do 
not sum in the same way for our group of 
therapists as they do by the scoring pro- 
cedures. The high correlation between the 
therapists’ rank ordering of the needs as they 
apply to the patients and the profiles derived 
from the individual codictions suggests that 
the therapists when forced to use the 15 
manifest needs are in agreement with the 
scoring system as to which items relate to a 
given need. But these 15 needs are not the 
dimensions underlying the therapists’ codic- 
tions. We may speculate that the accuracy on 
the individual items results from a different 
frame of reference than that on which the 
questionnaire is based. Perhaps the more fun- 
damental question is whether the scored di- 
mensions of the test are the same as the per- 
sonality dimensions of the patients. 

The lack of relation between the goal re- 
sponses or the importance ratings and the 
codictions made by the therapist indicates 
that the therapist’s orientation is not related 
to his knowledge of the patient. While it 
might be expected that the goal-set responses 
would differ from the codictions, that is, that 
the therapist would want the patient to be- 
come something different than he now per- 
ceives him as being, there is no clear negative 
relationship. One might also have expected 
that those areas which were considered more 
important by the therapist would be those of 
which he was most aware and thus those for 
which he could codict most accurately. This 
was not the case. Rather, the goals appear to 
be closely related to what the therapist him- 
self is like. Correlation between the goal-set 
responses and importance scores leads to the 
possible conclusion that therapists are more 
concerned with those attributes they want the 
patient to achieve than with those attributes 
they would like the patient to minimize. To 
combine these last two statements, we may
push our conclusions a bit further and say
that it is more important to the therapist that 
the patient exhibit the same dominant needs 
as he (the therapist) does than that the pa- 
tient minimize certain aspects of his own cur- 
rent functioning. The focus is on what one 
should be rather than on what one should 
not be. 


Summary 


The structure of psychotherapists’ under- 
standing of their patients was explored in a 
study of nine therapist-patient pairs. The as- 
sumption was that therapists have different 
degrees of understanding of the various needs 
of their patients. From the patients’ responses 
to the Edwards Personal Preference Schedule 
and the therapists’ utilization of the schedule 
under a variety of instructional sets, the fol- 
lowing conclusions are suggested: (a) thera- 
pists are aware of what they know and do 
not know about their patients, (b) therapists
are consistent in their evaluation of their pa- 
tients at a more concrete and at a more ab- 
stract level of formulation, (c) the goals that 
a therapist sets for the patient are related to 
the therapist’s personality structure, (d) in 
setting goals for his patient the therapist is 
more concerned with what the patient should
be like than with what he should not be like.

The conclusions are offered tentatively because of the small size and limited nature of the sample both with respect to the therapist and patient populations and the areas of personality investigated.

Received September 17, 1958.

ASSOCIATION TENDENCIES OF GROUPS DIFFERENTIATED ON THE TAYLOR MANIFEST ANXIETY SCALE 1

E. PHILIP TRAPP AND DONALD H


Unive 


Theoretical studies on the Taylor Manifest 
Anxiety Scale (MAS) have mainly been con- 
cerned with the relationship of MAS scores to 
drive level. The likely possibility that MAS 
scores may also reflect differences on other 
dimensions, such as differential habit or as- 
sociative tendencies, has been frequently pro- 
pounded (e.g., Child, 1954; Farber, 1955; 
Lazarus, Deese, & Osler, 1952), although 
little empirical evidence has actually been 
adduced. Farber (1955), in a recent review, 
aptly summarizes the untenable status of this 
particular property of the MAS in stating: 
“. . . it is not as yet entirely certain what
these habitual differences may be, apart from 
the trivial observation that they consist, at 
least in part, in the kinds of verbal responses 
given on the test itself” (p. 324). 

The present study was designed to investi- 
gate the nature of some of the associative 
tendencies presumably related to MAS scores. 
Since the MAS was developed directly from 
items of the MMPI purporting to reflect 
manifest symptoms of the anxiety state, it 
should follow that high scorers on the MAS 
would produce associations with attributes 
similar to those expected in a population of 
clinically anxious subjects (Ss). One commonly
ascribed attribute of the associations
of clinically anxious Ss is the relatively high
proportion of negative tones or implications.
Therefore, the approach taken in this study
was to compare the proportionate frequencies
of negatively toned associations elicited from
high and low scorers on the MAS to some
neutral stimulus task. The selected stimulus
task was a list of nonsense syllables. Predicting
that MAS scores would reflect differential
associative patterns, the hypothesis was stated
as follows: Ss scoring at the upper extreme on
the MAS will reflect significantly more negatively
toned associations to a list of nonsense
syllables than Ss scoring at the lower extreme
on the MAS.

1 This study was supported by a grant from the
College of Arts and Science Research Fund,
University of Arkansas. Edward Mequet assisted
in collecting the data.


Method 


Subjects. The Ss consisted of 43 students, 
male and female, in general psychology classes 
at the University of Arkansas. They repre- 
sented the two extremes of scorers on the 
MAS from a group of over 300 students. The 
Ss were tested during regular class periods. 

Procedure. The 320 nonsense syllables in
Hull's (1933) list were prepared for group
administration in a booklet containing 20
pages, with 16 syllables to the page, arranged
in random order. The Ss were informed that
they were participating in a study of the
association value of nonsense syllables. They
were instructed to look at each syllable briefly
and, if the syllable aroused any particular
thought or idea, they were to write it down
on the separate answer sheet. The use of a
template restricted their attention to one
syllable at a time.

The MAS was given some two weeks to a
month later, disguised as a biographical
inventory, by a different E. A short form of the
inventory, consisting of the 50 anxiety keyed
items and 40 buffer items, was used. The Ss
were informed that E was collecting
biographical information about students at
Arkansas to compare with students elsewhere.


Results and Discussion 


To simplify the statistical treatment of the
data and control for possible group differences
in total number of associations, 25 associations
were randomly selected from the answer
sheets of the Ss scoring at the two extremes
on the MAS. The high scorers (Group HA,
N of 21) scored 28 or above on the scale;
the low scorers (Group LA, N of 22) scored
6 or below on the scale. The groups consisted
of approximately the top 10% and bottom
10%, respectively, of the total distribution
of scores. Each set of 25 associations was
then typed on a separate answer sheet and
coded with respect to group identity.

Two judges, staff psychologists at the
University of Arkansas, independently evaluated
each association, classifying it as positively
toned, negatively toned, or neutrally toned.
A total of 1,055 judgments, consequently,
were made. Percentage of agreement between
the two judges was 92.
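
The agreement figure above is a simple proportion of matching classifications. A minimal sketch of that calculation follows; the function name `percent_agreement` and the ten ratings are hypothetical and are not data from the study.

```python
def percent_agreement(judge_a, judge_b):
    """Return the percentage of items given the same label by both judges."""
    if len(judge_a) != len(judge_b):
        raise ValueError("both judges must rate the same items")
    if not judge_a:
        raise ValueError("no ratings supplied")
    matches = sum(a == b for a, b in zip(judge_a, judge_b))
    return 100.0 * matches / len(judge_a)


# Hypothetical labels for ten associations (illustration only).
judge_a = ["neg", "neu", "neu", "pos", "neg", "neu", "neu", "neg", "pos", "neu"]
judge_b = ["neg", "neu", "pos", "pos", "neg", "neu", "neu", "neg", "pos", "neu"]

agreement = percent_agreement(judge_a, judge_b)
print(f"agreement: {agreement:.0f}%")
```

Raw percentage agreement does not correct for matches expected by chance, which matters when one category (here, the neutral one) dominates.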

A summary of the means and standard de- 
viations of the positively toned and negatively 
toned associations for the two groups is pre- 
sented in Table 1. 

Table 1

Summary Data of Number of Associations to
Nonsense Syllables

                            Negatively Toned    Positively Toned
                            Associations        Associations
Group                       Mean     SD         Mean     SD
HA (High Anxious, N = 21)   6.29     2.19       ...      ...
LA (Low Anxious, N = 22)    2.91     1.74       ...      ...

The hypothesis that high scorers on the
MAS would reflect significantly more negative
associations than low scorers was confirmed
by the t test beyond the .001 level
(t = 5.75; p < .001). Incidentally, no
significant difference was found between the
groups with respect to number of positively
toned associations (t = 1.09; p > .05). The
preponderance of associations made by both
groups was of the neutrally toned variety.
This would be expected from the nature of
the task.
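
The reported t for negatively toned associations can be roughly checked from summary statistics. The sketch below assumes a pooled-variance t test for independent groups and takes the Table 1 entries 6.29 (SD 2.19) for Group HA and 2.91 (SD 1.74) for Group LA as the negatively toned figures. The report does not say which form of the t test or which SD formula was used, so both are assumptions, and `pooled_t` is a hypothetical helper.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t from summary statistics, assuming equal variances.

    The SDs are treated as sample SDs (n - 1 denominator); this is an
    assumption, since the report does not specify the formula.
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / se, df


# Values as read from Table 1 (negatively toned associations).
t_neg, df_neg = pooled_t(6.29, 2.19, 21, 2.91, 1.74, 22)
print(f"t = {t_neg:.2f} on {df_neg} df")
```

Under these assumptions the statistic comes out near 5.6 on 41 df, close to but not identical with the reported 5.75; the gap may reflect rounding of the tabled values or a different variance estimate.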

A check was made on the sex distribution
of the two groups, and no appreciable difference
was obtained. Group HA consisted of 13
males and 8 females, and Group LA consisted
of 16 males and 6 females. Hence, the difference
in the obtained associations is unlikely
to be an artifact attributable to the sex variable.

The findings thus support the theoretical
position that the MAS does differentiate Ss
on associative factors as well as the more
frequently investigated nonassociative factors.
In other words, evidence is offered that MAS
scores reflect, in addition to the energizing
component of a motivational state, a directing
component, which some writers (e.g., Dollard
& Miller, 1950; Farber, 1955; Hull, 1943;
Shaffer, 1936) feel is an essential requirement
of a motivational variable.


Summary 


The present study was an investigation of
some of the associative tendencies that may
be reflected in MAS scores. The response
patterns of 21 high scorers on the MAS and 22
low scorers were compared in terms of the
frequencies of negatively toned associations
to a list of nonsense syllables. The hypothesis
that high scorers on the MAS would produce
a significantly greater proportion of negatively
toned associations was confirmed beyond the
.001 level (t = 5.75; p < .001). The groups
reflected, on the other hand, approximately
the same proportion of positively toned
associations. The sex ratio was nearly the same in
both groups, thereby eliminating the possibility
of the sex variable confounding the results.
Hence, the most probable explanation
is that MAS scores do reflect differential
associative tendencies.


THE BLOCK DESIGN TASK:
RELATIVE PERFORMANCES OF BRAIN-DAMAGED AND
CONTROL SUBJECTS 1
The standard block design procedure requires
a subject (S) to construct block patterns
with the design card as a continual reference
source. Under these conditions the task
has been demonstrated to be differentially
difficult for Ss with brain damage when compared
with various types of control groups
without brain damage (Aita, Armitage, Reitan,
& Rabinowitz, 1947; Allen, 1947, 1948; Boyd,
1949; Goldman, Greenblatt, & Coon, 1946;
Greenblatt, Goldman, & Coon, 1946; Hécaen,
de Ajuriaguerra, & Massonet, 1951; Heilbrun,
1954; Lidz, Gay, & Tietze, 1942; McFie &
Piercy, 1952; McFie, Piercy, & Zangwill,
1950; Milner, 1952; Paterson & Zangwill,
1944).

There have been attempts to modify the 
standard block design procedure to increase 
its sensitivity to pathological brain states. 
Grassi (1947, 1953) in his Block Substitution 
Test requires that an S copy a set of blocks 
rather than a drawing. In addition, the test 
includes using different colors from those of 
the model, and S is required to copy all six 
sides of the model. The Block Design Rota- 
tion Test devised by Shapiro (1951, 1952, 
1953) is scored by the amount of rotation of 
the completed block design. Both investigators 
report successful discrimination between brain- 
damaged and control groups, but in neither 
case was there an adequate comparison be- 
tween the revised and standard block design 
procedures. This would seem necessary before
procedural revision could be advanced as
a diagnostic improvement.

1 This investigation was supported by a research
grant (B-616) from the National Institute of
Neurological Diseases and Blindness of the National
Institutes of Health, United States Public Health
Service.

The present study evaluated a modification 
of the standard block design procedure by re- 
quiring S to construct the designs from mem- 
ory immediately following exposure of the 
stimulus card. The modified procedure was 
based on the hypothesis that a combination 
of two psychological processes, each inde- 
pendently sensitive to brain damage, would 
produce an even more sensitive instrument. 
Since numerous studies (Anderson, 1951; 
Benton, 1953; Cohen, 1952; Collins, 1951; 
Diers & Brown, 1950; Graham & Kendall, 
1946; Halstead, 1947; Heilbrun, in press; 
Hoedemaker & Murray, 1952; Ptacek & 
Young, 1954; Reynell, 1944; Ruesch & 
Moore, 1943) have demonstrated various tests 
of immediate memory to be differentially diffi- 
cult for brain-damaged Ss, the new procedure 
should provide evidence regarding the pro- 
posed hypothesis. 


Method 
Subjects 


The 83 Ss used in this study were patients 
at the University and Veterans Administra- 
tion Hospitals, Iowa City. This number in- 
cluded 40 brain-damaged Ss with confirmed 
cerebral pathology, obtained from neurologi- 
cal or neurosurgical services, and 43 physi- 
cally ill normal Ss who had no history of nor 
displayed any evidence of cerebral dysfunc- 
tion. 

Each diagnostic group was divided into two
experimental subgroups to be described later,
all groups being appropriately matched for
age and education. The data for these age
and education variables are presented in
Table 1. None of the differences in means or
variability approached statistical significance.

Table 1

Age and Education Data for the Brain-Damaged
and Physically Ill Control Groups

Group                N     Age Mean   (column unidentified)
Brain-damaged 1 a    21    36.10      10.67
Physically ill 1 a   23    37.96      10.26
Brain-damaged 2 b    19    36.79      10.68
Physically ill 2 b   20    34.50       9.85

a These groups were directly compared in the study.
b These groups were directly compared in the study.
Note. Only two of the four Mean and SD columns survive
in this copy; the second may be the age SD or the
education mean.

No Ss over 50 years of age were included 
in the study so as to control for the detri- 
mental effects of aging upon this type of in- 
tellectual performance. Also, any S who was 
not able to perform correctly on at least half 
the block design items administered under 
standard conditions was eliminated. This re- 
quirement excluded grossly impaired Ss from 
consideration. 


Task 


Twenty block designs, collected from the 
two forms of the Wechsler-Bellevue Intelli- 
gence Scale and the Kohs designs or con- 
structed by E, were combined into a set 
(Form A). Each design could be reproduced 
by using four Wechsler blocks. A parallel form 
of this task (Form B) was then constructed 
by color reversal, mirror reversal, or rotation 
of the designs. A pilot study using 20 college 
Ss found these two tasks to be of equal diffi- 
culty when administered by an immediate 
memory procedure. Two designs on each form 
were used as demonstration items, leaving 18 
test items to be scored on each. The items in 
Form A were placed in order of increasing 
difficulty as determined by the judgments of 
15 experienced clinical psychologists. Each 
Form B item was automatically placed by 
the judged position of its matched item on 
Form A. 

The standard block design task required the
S to copy four-block designs within a 60-sec.
time limit with the design continuously in
view for reference.

With the immediate memory block design 
procedure, a design was exposed for 10 sec. 
and then removed. An S was then given four 
blocks with which to construct the design 
from memory within the 60-sec. time limit. 


Scoring 


The scoring for block design performance
was modeled after that used on
the Wechsler-Bellevue. Four points were assigned
if a design was constructed without
errors within the 60-sec. time limit. Further,
the following speed credits were given when
a design was accurately reproduced: three
points, 1-15 sec.; two points, 16-30 sec.; one
point, 31-45 sec. Accordingly, a maximum
score of 126 points was possible for the 18
items.

Either block design task was discontinued 
after S made errors on five consecutive trials. 
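
The scoring and discontinuation rules above can be sketched as follows. The function names and the `(correct, seconds)` trial format are hypothetical. The sketch also assumes that a design not completed correctly within the time limit counts as an error for the five-consecutive-errors rule, which the text does not spell out.

```python
def item_score(correct, seconds, time_limit=60):
    """Score one design: 4 points for accuracy plus a speed credit of 0-3."""
    if not correct or seconds > time_limit:
        return 0
    if seconds <= 15:
        bonus = 3
    elif seconds <= 30:
        bonus = 2
    elif seconds <= 45:
        bonus = 1
    else:
        bonus = 0
    return 4 + bonus


def total_score(trials, stop_after=5):
    """Sum item scores, stopping after `stop_after` consecutive errors."""
    total = 0
    consecutive_errors = 0
    for correct, seconds in trials:
        points = item_score(correct, seconds)
        total += points
        consecutive_errors = 0 if points > 0 else consecutive_errors + 1
        if consecutive_errors >= stop_after:
            break
    return total


# Eighteen test items, each reproduced correctly in 10 sec.
perfect = total_score([(True, 10)] * 18)
print(perfect)
```

With 18 items at 4 accuracy points plus 3 speed points each, the maximum is 18 x 7 = 126, which matches the figure given in the text.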


Procedure 


Approximately half of each diagnostic group
was assigned to each of two order conditions.
In one order condition, Ss were given the immediate
memory procedure first and the standard
procedure second, while the remaining
brain-damaged and control Ss were given the
procedures in the reverse order. In either case,
the two measures were separated by interpolated
psychological tests requiring about a
half hour to administer. Within each order
condition, Ss were alternately given the memory
procedure on Form A and Form B.


Results 


The hypothesis of differential difficulty for 
brain-damaged Ss on the memory procedure 
was tested by comparing the scores on the 
standard and memory tasks for each S. These 
performance difference scores were then av- 
eraged over Ss in the brain-damaged and 
control groups. The critical test of the hy- 
pothesis was the difference between these av- 
erage group difference scores. 
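
The analysis just described can be sketched in a few lines: a difference score (standard minus memory) is formed for each S, the scores are averaged within each diagnostic group, and the two group means are compared. The scores below are hypothetical, the helper names are invented for illustration, and the pooled-variance t is an assumption about the form of test used.

```python
import math
from statistics import mean, variance


def difference_scores(pairs):
    """Standard-task score minus memory-task score for each S."""
    return [standard - memory for standard, memory in pairs]


def compare_groups(brain_damaged, control):
    """Return both mean difference scores, a pooled-variance t, and its df."""
    d_bd = difference_scores(brain_damaged)
    d_c = difference_scores(control)
    n1, n2 = len(d_bd), len(d_c)
    df = n1 + n2 - 2
    pooled = ((n1 - 1) * variance(d_bd) + (n2 - 1) * variance(d_c)) / df
    se = math.sqrt(pooled * (1 / n1 + 1 / n2))
    return mean(d_bd), mean(d_c), (mean(d_bd) - mean(d_c)) / se, df


# Hypothetical (standard, memory) scores, not data from the study.
brain_damaged = [(100, 40), (95, 45), (90, 30), (105, 50)]
control = [(110, 70), (105, 60), (100, 65), (115, 75)]

m_bd, m_c, t_value, df = compare_groups(brain_damaged, control)
print(m_bd, m_c, round(t_value, 2), df)
```

With the subgroup sizes given in Table 1 (21 and 23; 19 and 20), the n1 + n2 - 2 rule yields 42 and 37 degrees of freedom for the two order conditions.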

Table 2 presents the performance comparisons
for the two block design procedures
under the two order conditions. It can be
seen that the brain-damaged group showed a
greater difference in scores for the two
administration procedures, both when the memory
test was given first and when it followed
the standard test. The difference between the
mean difference scores was significant both
under the memory-first condition (t = 2.34
for 42 df, p < .02) and under the memory-second
condition (t = 2.24 for 37 df, p < .03).
These results clearly indicate that the
immediate memory block design procedure is
differentially difficult for brain-damaged Ss as
compared with control Ss when performance
on the standard block design procedure is
used as a baseline.

Table 2

Differences in Mean Scores for Brain-Damaged and Control Ss for Two Block Design
Procedures Under Two Order Conditions

                                  Brain-Damaged                       Control
Order                      Regular    Memory                  Regular    Memory
Condition                  Procedure  Procedure  Difference   Procedure  Procedure  Difference
Memory procedure first      98.62      41.34      ...         109.56      65.65      43.91
Memory procedure second     93.26      46.84      ...         101.95      68.55      33.40


Discussion


The hypothesis that the combination of two
psychological processes, each independently
sensitive to cerebral pathology, would provide
a complex task of increased sensitivity to
pathological brain states was supported by
the results of this study. Thus the revision of
the standard block design procedure, using
rather simple four-block designs, by administering
it as an immediate memory test provided
a relatively more difficult task for
brain-damaged Ss than controls when compared
with performances of these groups on the
standard task. This relationship held whether
the memory procedure was presented first and
the standard procedure second or whether this
order of presentation was reversed. Certainly
one implication of these findings would be
that further research direct itself towards
evaluation of more complex processes as
indicators of cerebral pathology.

The question could be raised as to how the
block design memory task compares in
discriminative efficiency to the standard task
used in the present study and to a more difficult
block design task using standard procedures.

An analysis of the overlap data for the 
memory task in the current study found that 
under the memory-first condition, the opti- 
mum cutting score provided correct classifi- 
cation for 14 of 21 brain-damaged Ss and 18 
of 23 control Ss, an average efficiency of 
about 73%. Under the memory-second con- 
dition, an optimum cutting point on the mem- 
ory task led to correct classification for 12 of 
19 brain-damaged Ss and 15 of 20 control Ss, 
with an average efficiency of about 69%. 

In comparison to these average discrimina- 
tive efficiency figures of about 73% and 69% 
for the memory task, the standard task used 
in the present study maximally identified 
about 70% and 65% under the memory-first 
and memory-second conditions. These stand- 
ard procedure figures are based upon correct 
classification of 12 out of 21 brain-damaged 
Ss and 19 of 23 control Ss (memory-first con- 
dition) and 16 of 19 brain-damaged Ss and 9 
of 20 controls (memory-second condition). 
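
The efficiency figures above can be approximated from the classification counts. The sketch below assumes that "average efficiency" is the unweighted mean of the two groups' correct-classification rates, and that low scores point toward brain damage when searching for a cutting score; neither assumption is stated in the text. The function names and the raw scores passed to `best_cutoff` are hypothetical.

```python
def average_efficiency(hits_bd, n_bd, hits_ctrl, n_ctrl):
    """Unweighted mean of the two groups' correct-classification rates."""
    return 0.5 * (hits_bd / n_bd + hits_ctrl / n_ctrl)


def best_cutoff(bd_scores, ctrl_scores):
    """Try each observed score as a cutting point.

    Scores at or below the cutoff are classified as brain-damaged.
    Returns (cutoff, efficiency) for the first cutoff with the highest
    average efficiency.
    """
    best = None
    for cut in sorted(set(bd_scores) | set(ctrl_scores)):
        hits_bd = sum(s <= cut for s in bd_scores)
        hits_ctrl = sum(s > cut for s in ctrl_scores)
        eff = average_efficiency(hits_bd, len(bd_scores),
                                 hits_ctrl, len(ctrl_scores))
        if best is None or eff > best[1]:
            best = (cut, eff)
    return best


# Counts reported in the text.
memory_first = average_efficiency(14, 21, 18, 23)
memory_second = average_efficiency(12, 19, 15, 20)
standard_first = average_efficiency(12, 21, 19, 23)
standard_second = average_efficiency(16, 19, 9, 20)

# Hypothetical raw scores, used only to exercise the cutoff search.
cutoff, efficiency = best_cutoff([20, 35, 40, 55, 70], [45, 60, 65, 80, 90])
```

On this reading the four values fall within about one percentage point of the reported 73%, 69%, 70%, and 65%. Pooling all Ss into a single hit rate gives similar but not identical figures, so the exact averaging rule remains uncertain.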

The results of a previous study by Heil- 
brun (1954) provide a basis for comparing 
the memory task with a more difficult stand- 
ard block design task. In the earlier study, 
100 Ss (71 brain-damaged Ss, 29 physically 
ill control Ss), selected from the same hos- 
pitals as the Ss in the present study, were 
given the block design subtest from the 
Wechsler-Bellevue under standard conditions 
as part of a larger test battery. It was found 
that this block design test provided better 
discrimination between brain-damaged Ss and 
controls than any of the other 14 tests in the
battery, accurately classifying an average of
about 72% of the Ss in these two groups 
after selection of an optimum cutoff score. 
The diagnostic comparison between the im- 
mediate memory block design procedure and 
the standard Wechsler block design test does 
not suggest that the former provides a more 
sensitive task for the detection of brain-dam- 
aged Ss. Further, the comparison between the 
memory procedure and the standard task uti- 
lized in the present study does not favor the 
memory procedure as a discriminative instru- 
ment in any clear-cut manner. However, the 
scoring system used in this study was chosen 
because it seemed to best allow for a direct 
comparison between the current standard pro- 
cedure with block designs and a modified pro- 
cedure. It seemed of interest to try at least 
one other scoring system to better judge 
whether the obtained levels of discriminative 
efficiency represented actual upper limits of 
the memory task for these samples of Ss or 
artificial limits imposed by the mode of scoring.
It was decided to score by the number of
individual blocks (i.e., there are four individual
blocks per design) correctly placed
within the time limit with no time bonus
scores. This more precise scoring for accuracy
of performance, again using an optimum cutting
score, correctly classified 17 of 21 brain-damaged
Ss and 17 of 23 control Ss under
the memory-first condition, an average cor- 
rect placement of about 78%. Under the 
memory-second condition, optimum cutting 
led to correct classification for 13 of 19 brain- 
damaged Ss and 16 of 20 controls, an average 
efficiency of about 74%. Using this more pre- 
cise accuracy score, the maximum efficiency 
for the standard procedure was approximately 
65% for the memory-first condition (10 of 
21 brain-damaged Ss and 19 of 23 controls 
correctly identified) and about 62% for the 
memory-second condition (14 of 19 brain- 
damaged Ss and 10 of 20 controls correctly 
identified). These findings, though based on 
small samples, give more support to the no- 
tion that the present memory procedure com- 
pares favorably to either the standard pro- 
cedure using simple four-block designs or the 
standardly administered and scored Wechsler 
task relative to the prediction of cerebral
pathology.

It should be noted that using this more precise
error scoring system, significant differences
between brain-damaged and control Ss
in their relative performances under the two
administrative procedures were demonstrable.
Under the memory-first condition the t value
was 3.51 for 42 df (p < .001), while under
the memory-second condition the t value was
2.64 for 37 df (p < .01). In both comparisons,
the brain-damaged showed the greater
relative impairment on the memory task.

The finding that discrimination between
brain-damaged and control Ss was enhanced
when bonus scores were not assigned for
speed of performance (in addition to the more 
precise accuracy scoring) raised the interest- 
ing possibility that speed of performance did 
not show the differential decline when the 
memory block design procedure was com- 
pared with the standard procedure. Analysis 
of performance speed showed that brain- 
damaged Ss were consistently slower than con- 
trols but that the speed differences between 
these diagnostic groups decreased under the 
memory procedure, though the trend did not 
reach statistical significance. This finding 
stands in contrast to the finding that brain- 
damaged Ss show a reliable increase in im- 
pairment with the memory procedure when 
only performance accuracy is considered. 

It should be emphasized that the findings 
of the present study do not warrant utilizing 
this particular memory task as an individual 
diagnostic measure. This study was primarily 
concerned with the effects of combining block 
design and immediate memory procedures 
upon the performance of Ss with cerebral 
damage and not with the construction of a 
diagnostic test. Considerably more research 
would be necessary before any such practical
application would be warranted.


Summary 

1. This study investigated the hypothesis 
that the combination of two psychological 
processes, each independently sensitive to the 
effects of cerebral damage, would provide a 
complex task of increased sensitivity to cere- 
bral pathology. 

2. Performances of Ss with confirmed cere- 
bral pathology and matched physically ill Ss 
showing no history or current evidence of
cerebral damage were compared on two block
design procedures—the standard copying pro- 
cedure and an immediate memory procedure. 

3. The prediction of differential difficulty 
on the memory task for brain-damaged Ss was 
supported by the data. 

4. Consideration of the overlap data in 
this study and the relevant data from another 
study suggested the relative diagnostic use- 
fulness of a block design memory procedure 
when a more precise evaluation of perform- 
ance accuracy is utilized. 

5. Comparison of the results obtained when 
accuracy and speed of performance were ana- 
lyzed separately suggested that brain-dam- 
aged Ss show differential impairment in ac- 
curacy of performance on the memory task 
but do not show differential impairment in 
speed of performance. 


Received August 4, 1958.


Most investigations of personality factors
in crime and delinquency have begun with a
legally defined sample of offenders, proceeded
with various comparisons between that group
and a more or less carefully matched group
of nonoffenders, and ended with ambiguous
results. After reviewing 113 such comparisons,
Schuessler and Cressey (1950) stated
that “The doubtful validity of many of the 
obtained differences, as well as the lack of 
consistency in the combined results, makes it 
impossible to conclude from these data that 
criminality and personality elements are
associated” (p. 476). Like all negative findings,
however, these can be interpreted in at least
two ways, viz., as evidence for essential
identity between offenders and nonoffenders 
in respect to personality, or as consequences 
of methodological failure. We believe the lat- 
ter interpretation to be more plausible than 
the first, and submit that the most glaring 
defects in many previous investigations lie in 
the gross behavioral heterogeneity of legal 
offenders and inadequacies in the instruments 
used to examine them. The first condition is 
likely to lead to serious attenuation of all ob- 
served relationships, and to eventuate in a 
most unimpressive array of low correlations 
and small differences. The second condition 
is likely to obscure true relationships beyond 
all recognition. 

As correctives, we propose that greater care
be exercised in defining and measuring certain
personality traits of empirically demonstrated
importance in the predisposition to
delinquent activity, and that subsequent research
be concentrated upon the origins and
consequences of these tendencies, rather than
in the direct but premature search for the
causes and outcomes of delinquency itself.
The response-based inference of constructs
which relate to delinquency, the development
of progressively more adequate measures of
these constructs, and the patient delay of the
search for causal antecedents until such steps
are taken, is slower and more restrictive than
most of the earlier approaches, but it promises
a degree of explanatory depth and precision
which the older methods can never afford.

1 Parts of this research were supported by a grant
from the Institute for Research and Training in the
Social Sciences, Vanderbilt University.

2 Now at Vanderbilt University.
The present study is an attempt to define
such constructs through the factor analysis of
two sets of questionnaire items previously
shown to differentiate between delinquents and
nondelinquents (Gough & Peterson, 1954;
Quay & Peterson, 1958). The initial mode of
item selection has guaranteed that these
response tendencies have something to do with
delinquency and, in fact, one of the item sets,
comprising the Socialization Scale of the
California Psychological Inventory, has displayed
a degree of construct validity truly unusual
among personality tests (Gough, 1957; Gough
& Peterson, 1954; Peterson, Quay, & Anderson,
1959). Inspection of the items, however,
reveals a diversity of content and meaning
nearly as great as for delinquency itself, and
this analysis was undertaken in the belief
that factor analysis could lead to the eduction
of a set of reasonably unitary, independent,
possibly meaningful constructs that would
offer more powerful hypotheses for further
research than is provided by the original more
heterogeneous measures.


Subjects and Procedure 


In an effort to maximize variance along
dimensions of primary concern, both delinquents
(N = 116) and nondelinquents (N = 115)
were included in the sample. To minimize
variance along at least one irrelevant
dimension, the analysis was restricted to data
from white subjects (Ss). Delinquents and
nondelinquents had been matched in respect 
to age and place of residence for the purposes 
of another study (Peterson et al., 1959), the 
report of which contains more information 
about the selection of Ss than this one can. 

All Ss were given a combined form of two 
previously developed “delinquency” scales 
(Gough & Peterson, 1954; Quay & Peterson, 
1958), phi coefficients were obtained for all 
item pairs, and 15 factors extracted by the 
complete centroid method. Judgments based 
on Tucker’s function and contributions to 
total variance led to the exclusion of 10 of 
the factors. The remaining five provide a 
highly efficient, if not exhaustively sufficient, 
accommodation of the relationships in the phi 
matrix. 
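
The first step above, a phi coefficient for every pair of true-false items, can be sketched as follows. The function names and the small response matrix are hypothetical; the centroid extraction and the rotations are not attempted here.

```python
import math


def phi_coefficient(x, y):
    """Phi coefficient between two dichotomous (0/1) item vectors."""
    n11 = sum(1 for a, b in zip(x, y) if a and b)
    n10 = sum(1 for a, b in zip(x, y) if a and not b)
    n01 = sum(1 for a, b in zip(x, y) if not a and b)
    n00 = sum(1 for a, b in zip(x, y) if not a and not b)
    denom = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        raise ValueError("phi is undefined when an item has no variance")
    return (n11 * n00 - n10 * n01) / denom


def phi_matrix(responses):
    """Item-by-item phi matrix; `responses` holds one 0/1 list per S."""
    k = len(responses[0])
    items = [[row[j] for row in responses] for j in range(k)]
    return [[1.0 if i == j else phi_coefficient(items[i], items[j])
             for j in range(k)] for i in range(k)]


# Hypothetical answers (1 = "true") from six Ss to three items.
responses = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 0, 1],
    [0, 0, 0],
    [1, 0, 1],
    [0, 1, 0],
]
matrix = phi_matrix(responses)
```

For 0/1 data, phi equals the Pearson product-moment correlation between the two items, which is why a phi matrix can serve as input to a factor extraction.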
Rotations were pursued in two ways. First,
an orthogonal analytic solution was reached 
by means of the quartimax routine for the 
electronic computer ( Neuhaus & Wrigley, 
1954). This yielded an acceptable but not 
strikingly impressive solution, in terms of the 
usual criteria of simple structure. All hyper- 
planes were significant at the .05 level by 
Bargmann’s (1954) test, but visual inspec- 
tion of the plots suggested that the solution 
might be improved somewhat by further 
shifts. The centroid factors were then rotated 
to an oblique analytic solution by means of 
the oblimax routine for the computer, and 
this was followed by a series of visually di- 
rected rotations until a virtually unimprov- 
able solution had been found. All hyperplanes 
defined by these latter operations were sig- 
nificant far beyond the .001 level. Intercor- 
relation of loadings, however, demonstrated 
that the two independent solutions were very 
closely similar (r’s for analogous factors 
ranged from .89 to .99), and the simplicity 
and objectivity of the orthogonal quartimax 
solution led to its choice for presentation 
here. The factors given below are utterly free 
of contamination by interpretative bias, offer 
the simplicity of an orthogonal space, and 


Table 1 


Factor 1: Psychopathy 


Loading 


64 The only way to settle anything is to lick 
the guy. 
62 Winning a fight is more fun than any 
thing. 
62 The people that run things are usually 
against me 
61 Cops usually treat you dirty. 
If you don’t have enough to live on, it's 
OK to steal. 
\ lot of times it’s fun to be in jail 
The only way to make big money is to 
steal it. 
A person is better off if he doesn’t trust 
anyone 
If the cops don’t like you, they will get 
you for anything. 
Life usually hands me a pretty raw deal 
Cops and judges will tell you one thing 
and do another. 
A guy like me hits first and asks ques 
tions later 


Loading 


43 +It’s dumb to trust older people 
43 I would do almost anything on a dare 
43 If somebody does something to me, I 
always get them back. 
Most brothers and sisters are 
trouble than they are worth 
I don’t mind lying if I am in bad trouble. 
I go out of my way to meet trouble 
rather than try to escape it 
I do what I want to, whether anybody 
likes it or not 


more 


I would rather be at home when things 
go wrong 

I got (or used to get) into a lot of fights 
in school. 

I never cared much for school 

I have never done any heavy drinking. 

I have run away from home because my 
folks treated me bad. 

I'm really too tough a guy to get along 
with most kids 
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Table 2 


Factor 2: Neuroticism 


Loading 


68 I often feel that I am not getting any 
where in life 

56 Sometimes I used to feel that I 
like to leave home 


would 


I seem to do things that I regret more 
often than other people do 

I have often gone against my parent’s 
wishes 

My parents often disapproved of my 
friends 

I sometimes wanted to run away from 
home 

I often feel as though I have done some 
thing wrong or wicked 

I don’t think I’m quite 
others seem to be 


as happy as 


People often talk about me behind my 
back 
With things going as they are, it’s pretty 


hard to keep up hope of amounting to 
something 
I would 


thar 


rather go without something 


ask for a favor 

I sometimes feel that I made the wrong 
choice in my occupation 

I have very strong likes and dislikes 

on the 


I often act spur of the 


without stopping to think 


moment 


are, for all practical purposes, equivalent to 
an oblique solution which is unassailable on 
the grounds of simple structure. 
Results 

The combined questionnaire as it was 
actually administered, the centroid factor ma- 
trix, and the rotated factor matrix are re- 
ported elsewhere.* The factors will be defined 
here, in descending variance, by 
statement of the items which have loadings 


order of 


of .30 or more. A positive sign for any load- 
ing implies that the response “true” is asso- 
ciated with the positive pole of the factor; a 
negative sign indicates that a “false” response 
is associated with the positive pole of the fac- 
tor. Item numbers refer to the variables as 
they appeared in the questionnaire as used 

The combined questionnaire, centroid factor ma 
trix, and rotated factor matrix have been deposited 
with the American Documentation Institute. Order 
Document No. 5920, microfilm 
or $1.25 for photocopies 


remitting $1.25 for 


No. Loading


+1 I get nervous when I have to ask someone
for a job

Sometimes I feel (or used to feel) that
if I could just get away from home,
everything would be all right

Cops and judges will tell you one thing
and do another

In school I was sometimes sent to the
principal for cutting up

My folks usually blame bad company
for the trouble I get into

Most of the time I feel happy

I have more than my share of things to
worry about

It is hard for me to act natural when I
am with new people

It isn’t their fault that [...]
into trouble

My folks have sometimes been in trouble
with the law

When I was a little kid, I was always
doing things my folks told me not to

I used to steal sometimes when I was a
youngster.

I have never been in trouble with the
law


herein and as deposited with the American 
Documentation Institute. 
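
The reporting convention just described (factors in descending order of variance, each defined by the items loading .30 or more in absolute value) can be sketched as follows. This is only an illustration: the loading matrix, the item labels, and the function name `describe_factors` are hypothetical, and the variance of a factor is taken here as the sum of its squared loadings.

```python
import numpy as np

def describe_factors(loadings, item_labels, cutoff=0.30):
    """List factors by descending variance with their salient items."""
    loadings = np.asarray(loadings, dtype=float)
    # Variance accounted for by each factor: sum of squared loadings.
    variance = (loadings ** 2).sum(axis=0)
    order = np.argsort(-variance)
    report = []
    for f in order:
        # Keep items whose absolute loading reaches the cutoff.
        salient = [(item_labels[i], float(loadings[i, f]))
                   for i in range(loadings.shape[0])
                   if abs(loadings[i, f]) >= cutoff]
        # Strongest loadings first within a factor.
        salient.sort(key=lambda pair: -abs(pair[1]))
        report.append((int(f), float(variance[f]), salient))
    return report

# Hypothetical 4-item, 2-factor matrix (not the deposited data).
demo_loadings = [[0.10, 0.55], [0.45, -0.05], [-0.32, 0.20], [0.05, 0.61]]
demo_labels = ["item 1", "item 2", "item 3", "item 4"]
demo_report = describe_factors(demo_loadings, demo_labels)
for factor, var, items in demo_report:
    print(factor, round(var, 3), items)
```

A negative loading is retained when its absolute value reaches the cutoff, which matches the sign convention given above for “false” responses.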

Tough, amoral, rebellious qualities are
obviously implied by the variables given in
Table 1, and these meanings, together with
those of impulsiveness, a conspicuous distrust
of legal and other authority, and an apparent
freedom from family ties, led to our
designation. Factor meaning seems very close
to that of “Unsocialized Aggression,” as
identified in the work of Jenkins and his
colleagues (Hewitt & Jenkins, 1946; Jenkins &
Glickman, 1947), and strongly involves the
same disregard for public opinion so prominent
in the “Psychopathic Personality” dimension
which emerged from Comrey’s (1958) factor
analysis of the Psychopathic Deviate (Pd)
scale of the MMPI.

Tendencies toward impulsive action again
appear in Factor 2, but the intrapsychic
concomitants, and perhaps the dynamic bases,
of the behavior are decidedly not psychopathic.





D. R. Peterson, H. C. Quay, and G. R. Cameron 


Table 3 
Factor 3: Family Dissension 


Loading 


61 My mother and father argue a lot 
60 My step-father (or step-mother) treats 
me badly. 
49 My home life was always very pleasant. 
49 I was often punished unfairly as a child. 
46 The members of my family were always 
very close to each other. 
41 My folks yell at us kids a lot.
39 My mother and father have never really 
been friends of mine. 
My home life was always happy 
My home life as a child was less peaceful 
than those of most other people. 
I have run away from home because my 
folks treated me bad 
I have lived in an orphans’ home or a 
foster home at some time 


Remorse, tension, guilt, depression, and
discouragement (in short, neurotic responses)
covary with antisocial activity. The factor
bears strong resemblance to a factor labeled
“Disturbed Delinquency” by Jenkins and
Glickman (1947), and to the “Neuroticism” factor
isolated by Comrey (1958). 

This is clearly a family background factor. 
The items are so uniform in meaning that no 
further discussion is necessary. 

Unlike the previous dimensions, Factor 4 is 
difficult to interpret. It was principally the 
pervasive sense of incompetence and failure 


Table 4 


Factor 4: Inadequacy 


Loading 


43 I have never been in trouble with the
law

41 I am behind at least a year in school

I’d quit school now if they would let me

When I was going to school I played
hooky quite often

When something goes wrong I usually
blame myself rather than the other
fellow.

I hardly ever get excited or thrilled

I enjoy work as much as play

My folks move (or used to move) from
place to place a lot.

I would have been more successful if
people had given me a fair chance


which suggested “Inadequacy” as an inter- 
pretative label, and the general impression is 
indeed that of an inability to cope with the 
problems of a complex world. The meaning of 
the factor, however, is obscure and our in- 
terpretation tentative. 

This factor is no easier to interpret than 
Factor 4. Only three of the items, however, 
attained very high loadings, and these all 
stand in some relationship to a history of con- 
flict with school authority. The title we have 
tentatively attached implies such a history. 


Table 5 


Factor 5: Scholastic Maladjustment


When I was going to school I played
hooky quite often

In school I was sometimes sent to the
principal for cutting up.

As a youngster in school I used to give
the teachers lots of trouble

I keep out of trouble at all costs

I don’t mind lying if I am in bad trouble

I never cared much for school.

I used to steal sometimes when I was a
youngster.

I think I am stricter about right and
wrong than most people.

I have used alcohol excessively

I have often gone against my parents’
wishes.


Summary 


Research on personality factors in delin- 
quency has long been impeded by the gross 
behavioral heterogeneity of delinquents, and 
various inadequacies in the measures used to 
study them. A factor analysis of two ques- 
tionnaire scales of demonstrated effectiveness 
in differentiating delinquents from nondelin- 
quents was conducted in the belief that fu- 
ture research on the origins and consequences 
of mediating personality tendencies thus de- 
fined would lead to greater scientific progress 
than direct investigation of legally defined de- 
linquency itself. Three personality dimensions 
and two background factors emerged. The 
first was characterized by a number of
psychopathic qualities and was named
accordingly. In the second factor, impulsive
antisocial behavior covaried with expressions of
regret, depression, and other negative affect.
It was interpreted as a neurotic dimension. 
The third putative personality factor implied 
a general sense of incompetence and was re- 
garded as an expression of inadequacy. Of 
the two background factors, one clearly re- 
lated to family dissension, and the other 
seemed, much less clearly, to relate to a his- 
tory of difficulty in school. 
Received August 7, 1958

THE TWISTED PEAR AND THE PREDICTION
OF BEHAVIOR 1, 2


JEROME FISHER

“To say that nature should conform to a
Gaussian distribution is asking too much. Who 
is there to tell nature what the statisticians 
would like?” This statement, made recently 
by Boring (1957) in an editorial context, 
seems an appropriate introduction to the 
question and the point of view presented in 
this paper. It is by no means a new point of 
view. Rather, it is a variation on the familiar 
theme that organisms, whose behaviors are 
variously studied for purposes of prediction, 
do not conform to the assumed mathematical 
conditions which are often assigned to them 
in statistical manipulations of data. 

In substance, the paper raises questions not 
only about the appropriateness of our sta- 
tistical assumptions as they concern predic- 
tion problems but, also, it ventures the propo- 
sition that, because of certain biologic and 
psychologic variants, organismic behaviors are

1 Based on a paper read at the XV International
Congress of Psychology, Brussels, July, 1957.

2 For many helpful suggestions and for their
encouragement the author is especially grateful
to Robert C. Tryon and Harrison Gough.


predictable in only one segment of a
predictor-criterion relationship.3

Several years ago, in analyzing the results 
of a cross-validation study (Fisher, Gonda, & 
Little, 1955) involving the Rorschach, my 
colleagues and I noted a curious but consist- 
ent result. When the Rorschach yielded a 
score indicating the presence of brain disease, 
the agreement with independent criterion 
judgments of brain pathology was extraordi- 
narily good (94%). A low score, however, 
was not accurate in predicting the absence of 
brain pathology. A Pearson validity coefficient 
was computed and found to be a respectable, 
significant, but humble .32. 
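
How a test can agree with the criterion almost every time it is positive and still yield a modest over-all coefficient can be sketched with a fourfold table. The counts below are hypothetical, chosen only so that 94% of positive findings are confirmed; they are not the data of the study cited. The phi coefficient (the product-moment r for two dichotomies) is used here as a stand-in for the Pearson coefficient computed on the original scores.

```python
import math

def fourfold_summary(tp, fp, fn, tn):
    """Accuracy of positive and negative findings, plus phi."""
    ppv = tp / (tp + fp)   # positive findings confirmed by the criterion
    npv = tn / (tn + fn)   # negative findings confirmed by the criterion
    num = tp * tn - fp * fn
    den = math.sqrt((tp + fp) * (fn + tn) * (tp + fn) * (fp + tn))
    return ppv, npv, num / den

# Hypothetical counts: 50 positive findings, 100 negative findings.
ppv, npv, phi = fourfold_summary(tp=47, fp=3, fn=60, tn=40)
print(round(ppv, 2), round(npv, 2), round(phi, 2))
```

With these assumed counts the positive findings are right 94% of the time, the negative findings only 40% of the time, and the single coefficient falls between .30 and .40, which hides the asymmetry.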

Next, we subjected several standard neu- 
rologic procedures, such as the EEG, the 
lumbar puncture, etc., to essentially the same 
validation analysis (Fisher & Gonda, 1955). 
Except for a few minor variations, five neu- 
rologic diagnostic techniques gave the same 
results: High accuracy in predicting pathol- 
ogy from positive test findings, but like the 
Rorschach, poor accuracy in predicting the 
absence of pathology from negative diag- 
nostic signs. The over-all validity coefficients 
ranged from .13 to .32. 

When scattergrams were plotted of the re- 
lationship between predictor scores and the 
criterion, a nonlinear heteroscedastic configu- 
ration was revealed which looked like a twisted 
pear; it is approximated in Fig. 1. This find- 
ing raised the question: Is the twisted pear 
unique to these data or is it a general
pattern, characteristic of prediction problems?

3 Cronbach and Gleser (1957) have recently
presented a similar argument to the one advanced
in this paper, based on the mathematics and
logic of decision theory, for considering the
problem of differential validities in
psychological testing.

The diagram will call to mind many familiar
observations and findings. When scattergrams
from several sources are examined, where more
or less known criteria are plotted against
more or less standard predictors, e.g.,
objective tests, a linear relationship with rela- 
tively little array variance will be observable 
at one end of the plot. The relationship, how- 
ever, becomes increasingly nonlinear and in- 
creasingly variable as it approaches the mid- 
dle and upper extreme. The terms “adaptive” 
and “nonadaptive” are used in Fig. 1 to sug- 
gest an implicit but functionalistic concep- 
tion of the corresponding relationships be- 
tween predictor “good” and “poor” scores 
and the criterion behavior of organisms. In 
other words, the predictor becomes decreas- 
ingly predictive of the criterion as the scores 
obtained increase from the “poor” to “good”
extremes of the predictor. To illustrate this 
predictor-criterion differential, the findings of 
studies of intelligence, learning, personality, 
and pathology will be presented and discussed. 
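
The unequal array variance described above can be illustrated with simulated data. The sketch below is an assumption-laden toy: the predictor is scaled from 0 (“poor”) to 1 (“good”), the criterion mean is kept linear, and the scatter is simply made to grow with the predictor. It therefore shows only the heteroscedastic part of the pattern, not the curvature, and the function names and constants are hypothetical.

```python
import numpy as np

def simulate_twisted_pear(n=3000, seed=0):
    """Predictor x in [0, 1]; criterion scatter widens as x increases."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 1.0, size=n)
    spread = 0.03 + 0.40 * x          # assumed: noise grows toward "good"
    y = x + spread * rng.standard_normal(n)
    return x, y

def array_spread(x, y, n_bins=3):
    """Standard deviation of the criterion within equal-width predictor bins."""
    edges = np.linspace(x.min(), x.max(), n_bins + 1)
    which = np.clip(np.digitize(x, edges[1:-1]), 0, n_bins - 1)
    return [float(y[which == b].std()) for b in range(n_bins)]

x, y = simulate_twisted_pear()
spreads = array_spread(x, y)
r = float(np.corrcoef(x, y)[0, 1])
print([round(s, 2) for s in spreads], round(r, 2))
```

The single coefficient r summarizes the whole plot, while the per-bin spreads show that prediction is tight at the low end and loose at the high end.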

According to Fig. 1, the predictive efficiency 
of the IQ at the “poor” extreme should be 
considerably better than for average and
“good” (superior) IQs. Specifically, with IQs
below, say, 50, it is highly predictable that
the individual will require custodial or
complete protective care, and that he will not
acquire any scholastic skills. With IQs of about
50-70, there appears to be moderately good 
predictability that the individual will require 
special training and guidance, particularly at 
work and at school. With IQs of approxi- 
mately 70-85, and, to a greater degree be- 
tween 85-110, however, vocational and scho- 
lastic limitations vs. successes become increas- 
ingly difficult to predict. 

Table 1 summarizes the results of several 
longitudinal studies of the vocational and edu- 
cational achievements of male mental defec- 
tives, their controls, and, in addition, the 
vocational and scholastic attainments of Ter- 
man’s gifted men (Terman & Oden, 1947). 
The latter included some 730 subjects whose
childhood IQs were 140 or higher and who
were classified into two groups by raters on
the basis of vocational achievement 18 years
later. The A group (mean childhood IQ =
155) was the most successful and the C group
(mean IQ = 150) the least successful. The C
group of 150 men, therefore, was, by
definition and rating procedures, an
underachieving group. Table 1 reveals that as a group,
Terman’s gifted men, 25 years of age and 
older in 1940, were superior occupationally 
to the general male population in the U. S. 
Further analysis of the data (not reported in 
Table 1), however, discloses that 28% of the 
C group fell at or below the median occu- 
pational level of the employed males in the 
U. S., as well as of those in California in
1940.4, 5 Terman and Oden (1947) note that:


On the average, those of highest IQ accomplish more 
and are equally well-adjusted, but one cannot
anywhere draw an arbitrary IQ line that will set off
potential genius from relative mediocrity. Some of
our subjects who have achieved most notably did
not, either in childhood or in adult life, rate above
the average of the total group in tested intelligence.


With regard to the lower end of the IQ 
range, Table 1 also gives the occupational 
achievements of two follow-up studies of 
mentally defective groups. The data reported 
are in the form of average percentages for 
the combined groups. Compared to the gifted 
and control groups, the findings suggest a 
stronger, less variable relationship between 
mental deficiency, as determined mainly by 
IQ, and occupational achievement. None of
the mental defective groups achieve I and II 
occupational categories; 23% are in III and 
IV, and 77% in V and VI. For the control 
samples, whose mean IQs are within the
average range, the proportion of cases falls in
the several categories as shown. Approximately
two-thirds of the gifted group attained
superior vocational levels (I and II).
The presence of 10 and 24% of the gifted in
the lower occupational Categories V and VI,
and III and IV, respectively, however,
suggests again the greater variability and,
hence, the potential predictive “error”
associated with “good” IQ scores.

4 See p. 361 ff. in Terman and Oden (1947).

5 The unemployed and incapacitated (N = 16) in
Terman’s gifted group were not included in the
numbers and percentages given in Table 1; in
addition, those who were classified as students
were also omitted. A mitigating consideration is
the postdepression economy of the period and the
relative employment “youthfulness” of the entire
gifted group. Many were still in transit
occupationally.

The same conditions, however, obtain for the
mentally defective and their control groups,
because all of the studies cited in Table 1 were
based upon the socioeconomic conditions of the
decade 1930-1940. In personal communication with
the author, Melita H. Oden reported that since
1940, follow-up studies of the gifted group have
shown intragroup shifts on the occupational
scale; some C men have moved up and others have
moved down on the scale; some A men have taken
positions in lower (III-VI) occupations.

Prediction of scholastic achievement by 
means of the IQ was studied by review of the 
same investigations. Table 1 also presents the 
follow-up findings of the same group of men 
whose IQs were obtained in childhood. The 
data are summarized in scatterplot form via 
averages to make them comparable to Wolfle’s 
(1951) categories of educational achievement. 

It appears that the IQ has considerable 
power in predicting scholastic achievement, 
particularly in the tails of the distribution of 
IQs. Compared to the controls and the gifted, 
however, the lower end of the IQ range (the 
mental defectives), reveals greater certainty 
of prediction for the criterion: scholastic
achievement.

At the upper extreme, the gifted’s
superiority educationally is self-evident (67%
graduated from college). According to Terman
and Oden, none of the gifted failed to
complete grade school. For the middle and
upper range of the predictor, however, the
computed percentages, placed in their
respective cells, suggest the characteristic
variability observed before and again reveal
the twisted pear shape.

What about learning, personality, and the
“twisted pear” phenomenon? In his genetic
research of dull-bright rats and maze-learning
ability, Tryon (1940) found that while the
poor maze-learning of the dull rats could be
predicted very well, the bright rats varied
throughout a wide range of learning scores
(errors). Table 2 presents a fourfold table
which was obtained by plotting around the
median value of the total errors of the
seventh generation (N = 153), the total errors
made by their dull and bright offspring of
the 15th to 18th generation. The reason for
using the seventh generation as the source for
the median value comes from Tryon’s
observation, “There appears to be a law of
diminishing returns, for after the F7 negligible
effects of selective breeding are noted.” None


Table 2 
Percentage of Bright and Dull Strains, 15th-18th
Generations, Who Fall Above and Below the
Median Total Errors of the 7th
Generation (N = 153)
(From Tryon, 1940)


Brights
(N = 332)


Predictor 


39 


Total Errors 
Criterion 


of the descendants of the dulls fell in the 
bright category, whereas 30% of the descend- 
ants of the bright ancestors performed in the 
dull category. The results, in short, suggest 
that the relation between predictor ancestors 
and criterion descendants, involving the learn- 
ing of a complex set of highly integrated acts, 
is that of the twisted pear. 
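
The construction of such a fourfold table (each strain split at a reference median taken from an earlier generation) can be sketched as below. The error scores and the reference median are invented for illustration and are not Tryon's data; the function name is hypothetical. The toy values are chosen so that the pattern resembles the one reported in the text: no dull descendants on the “bright” side, but some bright descendants on the “dull” side.

```python
import numpy as np

def fourfold_percentages(bright_errors, dull_errors, reference_median):
    """Percentage of each strain above and at-or-below a reference median."""
    table = {}
    for name, errors in (("bright", bright_errors), ("dull", dull_errors)):
        errors = np.asarray(errors, dtype=float)
        # More errors than the reference median = "dull" performance.
        n_above = np.count_nonzero(errors > reference_median)
        above = 100.0 * n_above / errors.size
        table[name] = {"above_median": above,
                       "below_or_at_median": 100.0 - above}
    return table

# Hypothetical total-error scores for descendants of each strain.
bright = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
dull = [80, 90, 100, 110, 120, 130, 140, 150, 160, 170]
table = fourfold_percentages(bright, dull, reference_median=75)
print(table)
```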

The work of the California-Berkeley group
on authoritarianism suggests a similar pattern
of differential predictability in personality
assessment (Adorno, Frenkel-Brunswik,
Levinson, & Sanford, 1950). While high scorers
(authoritarians) and low scorers
(nonauthoritarians) “emerge as a result of
statistical analysis they consist in
accumulations of symptoms frequently found
together but they leave plenty of room for
variations of specific features. Furthermore,
various distinct subtypes are found within each
of the two major portions.” Analysis of the
prejudiced subjects, however, revealed that
they are “on the whole more alike as a group
than are the unprejudiced. The latter include
a great variety of personalities” (pp. 971-972).
It seems, therefore, that a high score on the
ethnocentrism scale is more predictive of
authoritarianism than are low scores of its
absence.

Leaving psychology for a moment, an
analogue seems to exist in the actuarial
determinations of the life insurance field
(see Fig. 1).
For example, having reviewed their experi- 
ence tables and having ascertained the extent 
of their errors of prediction, insurance com- 
panies confidently relate obesity to a short 
span of life and set their premium rates ac- 
cordingly. As weight approaches nonobesity, 
however, there appears to be increasing vari- 
ability in predicting life expectancy. It is, of 
course, true that no one dies of obesity and 
the nonobese may die of many other causes; 
yet as a predictor variable of the probability 
of life expectancy, insurance companies re- 
spect the highly significant relationship found 
at the “poor”-nonadaptive extreme, i.e.,
between obesity and short life expectancy.

In their daily practice, my hospital medical 
colleagues confirm the curvilinear relationship 
between their tests and their criteria of pa- 
thology; at least, they acknowledge implicitly 
the application of differential predictability. 
For example, it seems that medical special- 
ists almost always regard a positive test find- 
ing (“poor” predictor score) with consider- 
able respect. The reason for this is that their 
diagnostic techniques are accurate at least 
80% of the time when they yield a positive 
finding, i.e., the false positive rate is low. A 
negative or even a borderline finding (“good”
to average score), however, is invariably
disregarded, if, in the physician’s opinion, this
regarded, if, in the physician’s opinion, this 
finding is at variance with the patient’s his- 
tory, his presenting complaints, his symptoms 
and other diagnostic data. With a negative 
(“good” score) finding, therefore, the phy- 
sician functions with a clinical relativism and 
a clinical tolerance of variability and errors of
prediction. 

When we apply our correlation statistics to 
measure the magnitude of the relationship be- 
tween a criterion and a predictor, the coeffi- 
cient thus obtained gives an average, over-all 
statement of differential predictions of the 
test. The result, being a statement of the 
weighted means of the selection ratios of the 
predictor classes Y1, Y2, ... Yn on the criterion
X, is to attenuate the validity coefficient. If 
the product-moment r is computed without 
drawing the scatterplot, the plot may or may 
not be heteroscedastic, and the analyst can- 
not tell whether there are differential predic- 
tions of the various Y classes. 




Guilford (1956), among others, has argued 
for the importance of scattergram inspection. 
With respect to organismic behavior, there- 
fore, the case of linear association and equal 
variance obtains and applies partially, but not 
throughout a predictor-criterion relationship. 
The case of curvilinear association and un- 
equal variance, however, also applies and, 
therefore, both cases are relevant and deserve 
our differential appreciation. 

It has been said many times before and in 
many different ways that the behavior of the 
adaptive, functioning organism is ordinarily 
highly complex. I think we mean by this a 
capacity for variable, substitutive, compensa- 
tory behaviors. Under the disruptive condi- 
tions of pathology or stress, for example, it is 
as if special homeostatic mechanisms provide 
a biologic and/or psychologic smoke screen of 
adaptiveness. These on-going, restorative proc- 
esses may be largely responsible for the high 
rate of “normal” responses or false negatives. 
It is perhaps no accident that many of our 
most useful diagnostic methods have been de- 
vised to assess pathology in status; i.e., when 
the camouflage is no longer impenetrable, or 
when adaptiveness has been or is curtailed 
beyond a certain point. I am referring here, 
in particular, to Binet’s original mental test— 
an extraordinarily fine screening device for 
mental deficiency.6 It is precisely in the
nonadaptive segment, then, that our predictors
seem to prove themselves by virtue of their 
high rate of predictive accuracy. In this con- 
nection, it should be pointed out that they 
are not entirely without predictive power in 
the adaptive segment either, albeit negative 
errors lessen their accuracy. It seems, there- 
fore, that when predictor measures are extra- 
polated beyond that “certain point,” i.e., from 
“poor” to “good,” the multivariable
complexities of behavior multiply and interfere
with our predictive accuracy. Hence, it appears
that whatever validity a measure may possess
for predicting adaptive behavior, it is likely
to be more accurate in predicting nonadaptive
behavior.

6 Recently, Jones (1958) advanced the concept of
polarity to explain the one-ended conceptual clarity
of most psychologic scales. He proposes a method
for testing interitem homogeneity at either or both
ends of a continuum to determine the degree of scale
polarity.


Perhaps the hidden determinant in all this
is the criterion dimension itself extending from
the nonadaptive to the adaptive. It, too, ap- 
pears to have a partial range of certainty 
and “truth,” probably because more is known 
about the defining criterion points of non- 
adaptiveness than about those of adaptive be- 
havior. Our test construction procedures pro- 
vide us with reasonably good techniques and 
measures of criterion nonadaptive behavior. 
If, however, the twisted pear phenomenon 
possesses the degree of generality suggested 
by the data reviewed, then, there is reason 
to question the assumptions for extrapolating 
from the nonadaptive to the adaptive ex- 
tremes of predictor-criterion relationships. 

Is it possible that the observed partial 
curvilinearity and differential predictive vari- 
ance are merely artifacts of sampling (of our 
tests) or of our criterion determinations? Is 
the twisted pear solely a “clinical” phenome- 
non, or is it a more general characteristic of 
the prediction problem? Perhaps these ques- 
tions are unanswerable at the present time or 
perhaps we are dealing with a reality of
organismic behavior, one of the existential
dilemmas, to borrow a term from Erich 
Fromm, for the behavioral sciences. It is as 
if, on the one hand, there is general accept- 
ance of the dynamic nature of organismic 
adaptation and change, and the variability 
thereby induced, while, on the other hand, 
there is the coexisting pursuit of immutable 
validity coefficients as an attainable goal in
the business of understanding and predicting
behavior.


Received August 11, 1958

FACTOR ANALYSES OF RORSCHACH SCORING
CATEGORIES AND FIRST RESPONSE
TIMES IN NORMALS 1


JULIUS WISHNER 


University of Pennsylvania 


The Scoring Categories 


The present studies were undertaken and 
completed prior to the appearance of the fac- 
tor analyses by Williams and Lawrence (1953, 
1954) and Consalvi and Canter (1957). The 
study by Borgatta and Eschenbach (1955) in 
particular raises serious questions as to the 
predictability of overt behavior from the Ror- 
schach test, as does the experiment of Holtz- 
man and Sells (1954). Nevertheless, the Ror- 
schach test continues to be used in the clini- 
cal setting, and there is continued interest in 
research results. 

Opportunity to check on Wittenborn’s 
(1950a) results of his factor analysis of the 
Rorschach scoring categories, taking advan- 
tage of suggestions stemming from Cronbach's 
(1949) critique of statistical procedures uti- 
lized by Rorschach researchers, was provided 
by the Rorschach data collected by Beck, 
Rabin, Thiesen, Molish, and Thetford (1950) 
on 157 subjects from a stratified normal
population. These data also provided the
opportunity to determine if factors of verbal
productivity and psychological efficiency
suggested by previous investigations (Wishner,
1948, 1953, 1955) were identifiable. Verbal
productivity factors in the Rorschach have 


1 The original data for this study were collected 
under the supervision of Samuel J. Beck at the 
Michael Reese Hospital in Chicago. The author is 
indebted to S. J. Beck and the administrators of the 
Department of Psychiatry of the Michael Reese Hos- 
pital for making the raw data available. The major 
part of the statistical work was performed by Lillian 
Berg. The author is pleased to express his deep
appreciation to the Committee on the Advancement of
Research of the University of Pennsylvania for its
financial support of this project. Appreciation is also
due Malcolm G. Preston and Howard Maher for
helpful suggestions and consultation.


also been reported by Sen (1950), Fiske and
Baughman (1953), Borgatta and Eschenbach 
(1955), and Wittenborn (1950a). 

From the point of view of populations 
studied, Wittenborn’s Yale undergraduates 
are the only nonpsychiatric subjects whose 
Rorschachs have been factor-analyzed. The 
present study deals with a nonpsychiatric 
population, less selected than Yale under- 
graduates, representing four general occupa- 
tional groups: executives and junior execu- 
tives, skilled workers, semiskilled, and un- 
skilled as reported elsewhere (Beck et al., 
1950). While the population studied by 
Eschenbach and Borgatta (1955) was also 
ostensibly normal, it did include military per- 
sonnel who had been incarcerated for breaches 
of discipline; in addition, their population, in 
general, can be presumed to be as selective 
as the Air Force is in selecting its personnel. 

From the standpoint of scoring and com- 
putational procedures, the present study is 
most like those of Eschenbach and Borgatta 
(1955). They dealt with the general problem 
of the dependence of the absolute number of 
responses in any particular scoring category 
on the total number of responses (R) by 
means of partial correlation. In this study, 
the scores were converted into percentages. 
Since many of the distributions were highly 
skewed, scattergrams were constructed of the 
relationship between the most highly skewed 
variables, and tests of linearity were made. 
Significant nonlinearity was found in one case, 
and this is discussed below. 


Computational Procedures, Results, and In- 
terpretation of Factors 
The population is described in detail by 
Beck et al. (1950). There were 71 males and

Table 1


Rorschach Scoring Categories Analyzed


Variable
No.
1.
2. W%
3. D%
4. Dd%
5. M%
6. (C + CF)%
7. (C + CF + FC)%
8. FC%
9. Sum C/R
10. (Y+YF-
11. (V
12. (Y
13. F%
14. F+%
15. H%
16. A%
17. N/R (N = The number of content
categories utilized by the subject)
18. Z/R
19. P%
20. T/R (Time per response)


86 females; the age range was 17 to 69, with 
a mean of 30.5 and an SD of 8.55. There 
were 39 executives and junior executives, 47 
skilled workers, 44 semiskilled, and 24 un- 
skilled. The subjects (Ss) were administered 
the Rorschach by two different examiners. 
Fiske and Baughman (1953) report that there 
were no systematic differences between the ex- 
aminers in total number of responses elicited, 
and that each tested the same proportion of 
the four vocational groups. 

The scoring categories chosen for analysis 
are shown in Table 1. It will be seen that, for 
the most part, these are the standard Ror- 
schach scoring categories as used by Beck. 
The C, Y, and V scores were broken up as 
they are in order to test again Wittenborn’s 
findings that FC has a factorial composition 
different from C and CF, to test the more 
general implication of his findings concerning 
the dominance of F, and to test the assertion 
implied by the scoring system that Y and V 
represent different psychological qualities. 

The product-moment intercorrelations of 
these scores were computed and are shown in 
Table 2. Blakeman’s tests of linearity were 
applied to 12 comparisons involving the most 
skewed distributions, and yielded significant
nonlinearity only between H% and P%. However,
since r of H% and P% is -.04, whereas
η is only .33, and since inspection of
scattergrams indicated no obvious curvilinearity in
the other comparisons, it did not seem neces- 
sary to resort to transformations. 
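
The comparison between r and the correlation ratio that underlies a test of linearity can be sketched as follows. This is not Blakeman's test itself, only the two quantities it compares; the binning into five quantile groups, the function name, and the demonstration data are assumptions made for illustration.

```python
import numpy as np

def correlation_ratio(x, y, n_bins=5):
    """Eta of y on x, with x grouped into quantile bins."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    edges = np.quantile(x, np.linspace(0.0, 1.0, n_bins + 1))
    which = np.digitize(x, edges[1:-1])
    grand_mean = y.mean()
    between = 0.0
    for b in range(n_bins):
        group = y[which == b]
        if group.size:
            # Between-array sum of squares: array means around the grand mean.
            between += group.size * (group.mean() - grand_mean) ** 2
    total = ((y - grand_mean) ** 2).sum()
    return float(np.sqrt(between / total))

# Hypothetical, strongly curvilinear data: r is near zero, eta is large.
x_demo = np.linspace(-1.0, 1.0, 201)
y_demo = x_demo ** 2
r_demo = float(np.corrcoef(x_demo, y_demo)[0, 1])
eta_demo = correlation_ratio(x_demo, y_demo)
print(round(r_demo, 3), round(eta_demo, 3))
```

When eta greatly exceeds the absolute value of r, a curvilinear relation is indicated; when both are small, as in the H% and P% comparison above, there is little relation of either kind to transform.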

These intercorrelations were then factor- 
analyzed by the centroid method, and the re- 
sults are shown in Table 3. As in most factor 
analyses of the Rorschach test, four factors 
emerge. 
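
For readers unfamiliar with the centroid method, its first step can be sketched briefly. The sketch below extracts only the first centroid factor and omits the sign reflections needed before later factors are extracted; the small correlation matrix is hypothetical, and unities are left in the diagonal as a simplifying assumption (communality estimates are more usual).

```python
import numpy as np

def first_centroid_factor(R):
    """First centroid loadings and the residual correlation matrix."""
    R = np.asarray(R, dtype=float)
    col_sums = R.sum(axis=0)
    total = R.sum()                      # must be positive
    # Each loading is the column sum divided by the root of the grand sum.
    loadings = col_sums / np.sqrt(total)
    residual = R - np.outer(loadings, loadings)
    return loadings, residual

# Hypothetical 3-variable correlation matrix.
R_demo = [[1.0, 0.6, 0.5],
          [0.6, 1.0, 0.4],
          [0.5, 0.4, 1.0]]
loadings, residual = first_centroid_factor(R_demo)
print(np.round(loadings, 3))
print(np.round(residual.sum(axis=0), 6))
```

The residual columns sum to zero, which is why some variables must be reflected before a second centroid factor can be taken out.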

Attempts to rotate these factors according 
to Thurstone’s criteria of simple structure did 
not meet with success. Plots of the factors 
against each other revealed that in most cases 
there was an approximately equal distribution 
of the tests in the four quadrants. Thus, there 
seemed to be no objective bases for preferring 
one rotation over another. 

It was decided, therefore, to make direct 
tests of the hypotheses underlying this study, 
as well as tests of general Rorschach
hypotheses. These hypotheses were:


1. That there is a factor of psychological
efficiency;

2. That the factorial composition of FC is
different from that of C and CF;

3. That there is a factor of verbal
productivity.


The fourth hypothesis, added after the first
three had been tested, was:


4. That there is a factor of compulsivity.


The procedure utilized in testing these
hypotheses consisted of setting down the scores
which are supposed to reflect the factor
involved, choosing the central score of this
complex, rotating the factors so that the loading
on this central score is maximized, noting the
loadings on the other scores, and then
estimating the degree of agreement with the
original hypothesis. For example, in order to test
the hypothesis that a factor of psychological
efficiency is identifiable in the Rorschach, the
factors were rotated through F%, since in a
previous study it was found that F% showed
the highest coefficients of correlation with
general alertness (Wishner, 1953). It was
hypothesized that the factor would have
loadings in F%, T/R, and A%, as well as
significant loadings of the opposite sign in M,




Table 2

Intercorrelations of Rorschach Scoring Categories

(The correlation matrix is not legible in this copy.)







Table 3 


Unrotated Factor Loadings—Scoring Categories 


Variable 
No. IV 


3627 
3005 
2901 
.1362 
4561 
2011 
.2350 
.0509 
.2092 
3436 
.2959 
4301 
.2909 
~— 3180 
3947 
2561 
.1099 
3000 
.1991 
1533 


6417 
1027 
3520 
5476 
.6709 
.7446 
1.0199 
.2556 
9317 
5986 
5007 
8892 
1012 
4255 
.6738 
4657 
.2129 
.6622 
4401 
3094 


4288 
.7896 
5442 
4552 
6298 
8827 
4667 
0048 
3475 
2249 
6711 
0792 
2078 


2966 
3036 
4571 
3423 


C, Y, and V. The reader will recognize the
similarity of this procedure to Eysenck’s
(1950) method of criterion analysis.
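The rotation step described above can be sketched for a single pair of orthogonal factors. This is only an illustration under stated assumptions: the study rotated four factors in pairs, whereas the hypothetical helper `rotate_through` handles one pair, and the loadings shown are invented for the example.

```python
import math

def rotate_through(loadings, criterion):
    """Rotate two orthogonal factors through a criterion variable.

    loadings maps a score name to its (factor_a, factor_b) loadings.
    After rotation the criterion has its largest possible loading on
    the first factor and a zero loading on the second.
    """
    a, b = loadings[criterion]
    theta = math.atan2(b, a)          # angle of the criterion vector
    c, s = math.cos(theta), math.sin(theta)
    return {name: (x * c + y * s, -x * s + y * c)
            for name, (x, y) in loadings.items()}

# Hypothetical loadings on two unrotated factors.
pair = {"F%": (0.3, 0.4), "M": (-0.5, 0.0)}
rotated = rotate_through(pair, "F%")
```

After the rotation, the loadings of the remaining scores on the first factor can be inspected for agreement in sign and size with the hypothesis.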


The final rotated factor loadings are shown 
in Table 4. With respect to the first hypothe- 
sis, Factor II’ is the one rotated so that its 
loadings on F% would be maximized. The 
hypothesis required that the loadings on T/R
and A% be of the same sign as F% and that
the loadings on M, C, Y, and V be of opposite
signs. If we adopt a convention of considering 
only factor loadings of .35 or higher in order 
to deal with at least 10% of the variance of 
a score, the requirements of the hypothesis 
seem to be met. This factor resembles one in- 
terpreted as representing perceptual control 
by Wittenborn (cf. Factor B in 1950b). Such 
an interpretation is perhaps closer to the data 
than the one offered here. At the same time, 
it is interesting that the Rorschach can yield 
a factor predicted on the basis of a view of 
psychological efficiency as the central concept 
in mental health (Wishner, 1955). 
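The .35 convention rests on the fact that a squared loading gives the share of a score's variance attributable to a factor, and .35 squared is about .12. A brief sketch, with hypothetical loadings and hypothetical helper names:

```python
def variance_share(loading):
    """Proportion of a score's variance associated with one factor."""
    return loading ** 2

def salient(loadings, cutoff=0.35):
    """Keep only loadings whose absolute value reaches the cutoff."""
    return {name: value for name, value in loadings.items()
            if abs(value) >= cutoff}

share = variance_share(0.35)
# Hypothetical loadings on a single rotated factor.
example = salient({"F%": 0.62, "T/R": 0.41, "P%": 0.12, "M": -0.48})
```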

Psychologically, this factor seems to reflect 
a continuum ranging from sensitivity to vari- 
ous nuances of the Rorschach cards (loadings 
of the same sign on M, C, Y and V, and F+) 
conforming to content most frequently re-
ported (F+), at one end, to deliberate (T/R)
and relatively exclusive attention to form
(F%) with animal content (A%), at the
other end. At a higher level of abstraction, 
one might interpret this as flexibility—rigidity. 
Taking into account previously demonstrated 
relationships with physiological variables, it 
does not seem too farfetched, even if some- 
what speculative, to suppose that perceptual- 
flexibility-with-control may be the central ba- 
sis for motor efficiency (Wishner: 1953, 1955). 
However speculative, it is still true that it was 
just this notion which led to the prediction 
apparently confirmed. 

With respect to Hypothesis 2 concerning 
the factorial composition of C, FC, and M, 
rotations were made of Factors I’ and III, I’ 
and IV, and III and IV so as to maximize 
both M and FC alternately. None of these 
rotations produced a meaningful factor. It can 
be seen by an inspection of Table 4 that M 
and FC do not have the same factorial com- 
position, as shown by the fact that they do 
not have significant loadings on the same fac- 
tors, and indeed, they have loadings of op- 
posite signs on two of the factors in the cen- 
troid analysis, and on one of the factors in 


the rotated analysis. At the same time, the 


Table 4


Rotated Factor Loadings—Scoring Categories


Variable 

No Iii” IV’ 
6825 0207 
2006 


0012 
1603 
1856 
0795 
6360 
3293 
4047 


2227 


Al81 
2544 
2278 

1113 
1594 

4224 

5050 
1874 
4644 

1740 

— 3872 
3280 
4121 

4012 
0544 
2793 
0669 
2370 
3312 


2791 


6410 
7019 
3518 
5470 
6704 
7440 
0189 
2554 
9308 
5984 
5004 
RRR7 
1004 
$253 
6732 
4653 
2127 
6615 
4396 
3092 


7373 
4149 
6611 


7272 


3055 
3021 
4369 
3504 —.5727 
4172 


2305 


6527 
3429 
3983 6384 
1503 
1281 
272 


3859 
3859 
0788 
2921 
1320 
0589 
6953 
Q284 
3358 
1896 
1754 


1768 


6298 
5726 
7884 
0616 
3190 
3794 
3522 
2953 
5663 
5366 


3073 


0536 
3989 
2070 
5123 
Q909 
1987 
1058 


3250 
factorial composition of FC seems very simi-
lar to C, C+CF, C+CF+FC, and Sum C/R.
Thus, this aspect of Wittenborn’s results is 
not confirmed in this study. 

With respect to Hypothesis 3, Factors I’ 
and III and Factors I’ and IV were rotated 
through Dd%, and this yielded Factor I” 
which appears to be meaningful. The hy- 
pothesis required that there be high loadings 
on Dd%, M%, and C. This is seen, except 
that the loading on M% is not notably high. 
It is difficult, however, to exclude the possi- 
bility of artifact in that this may again rep- 
resent some part of the productivity factor. 
Thus, for example, there is high loading on 
R, V, and N, as well as on the other loca- 
tional variables. 

The final rotation was an attempt to repro- 
duce the productivity factor by rotating Fac- 
tor IV against III’ through R. The produc- 
tivity factor required that R, W, and Z have 
high loadings of the same sign, with A of op- 
posite sign, based on the relationships with 
various subtests of the Wechsler-Bellevue 


found previously (Wishner, 1948). While this 
general pattern is seen in Factor IV’, the 
loadings do not appear high enough to war- 


rant its interpretation as an intelligence fac- 
tor; indeed, Factor I” appears to fit the bill 
better. Thus, one might view both Factors I’ 
and IV’ as productivity factors, but having 
different psychological bases. Factor I’ may 
reflect productivity based on unusual frag- 
mentations of the blots (note loadings on D 
and Dd contrasted with loadings of opposite 
sign on W, P, and A). The productivity re- 
flected in Factor IV’ seems based on particu- 
lar stimulability by color. Thus, we may have 
two independent types of productivity: the 
Dd type and the C type. Presumably, some 
optimal combination of these would reflect 
intelligence as measured by standard psycho- 
metrics, and this may account for the vari- 
able findings concerning intelligence on the 
Rorschach. 

Factor III” remains to be interpreted. It 
has significant loadings on M and H, and 
loadings of opposite sign on C, Y, and per-
haps N. This would appear to indicate a
tendency for individuals to predominate in
one or the other, although there is no di-
rectly inverse correlation between M and C
(see Table 1). In view of previously ambigu-
ous findings in this sphere, it is unclear what
psychological significance this factor may 
have. 


Discussion 


The results are not unlike those found 
previously, although a wholly different popu- 
lation from those in other studies was utilized. 
Almost all investigators since Wittenborn have 
found that the Rorschach yields four factors. 
Similarly, they have found a factor variously 
labeled as productivity, verbal intelligence, 
etc., while the other factors have proved am- 
biguous. The most thorough study by Eschen- 
bach and Borgatta (1955) indicates the in- 
adequacy of the Rorschach in the prediction 
of overt behavior. 

In all these respects, the present study is 
confirmatory, or at least not contradictory. 
There is a failure to confirm Wittenborn’s 
finding of similar factorial compositions for 
FC and M, which are different from C and 
CF; of course, no correlation with overt be- 
havior was attempted. 

On the positive side, there appears to be 
confirmation of a factor of psychological effi- 
ciency which would be related to mental 
health and illness as suggested in previous 
papers (Wishner: 1953, 1955). Clinically, 
after all, the Rorschach may well be con- 
ceived as having the estimate of mental ill- 
ness as its major function, despite Rorschach’s 
original denial that his test should have any- 
thing to do with psychiatric diagnosis. Thus, 
this confirmatory finding of a previous sug- 
gestion may be worthy of further investiga- 
tion. 

The h² column of Table 4 indicates that
the greater part of the variance of D%, FC%, 
F+%, A%, N/R, P%, and T/R is unac- 
counted for by these factors. The first four 
of these have received extensive attention 
both in the Rorschach literature and in clini- 
cal work, but their significance, insofar as fac- 
tor analyses can reveal it, seems in doubt. On 
the other hand, more than 70% of the vari- 
ance of W%, C treated indifferently as be- 
tween CF and FC, Y+V, and F% is ac- 
counted for. F% and the color and shading 
determinants may be conceived as two faces 
of the same coin. In general, the signs of the
factor loadings of F% are the opposite of
those of Y, V, and C. This is practically
guaranteed by the scoring procedures, since
high C%, Y%, and V% necessarily place a
limit on F%. It appears, therefore, that F%
as a score reflecting perceptual control merits 
much greater attention than it generally re- 
ceives, particularly in the light of the concep- 
tion of psychological health here proposed. It 
is interesting that some 67% of the variance 
of M% is accounted for and that F% and 
M% tend to have similar factorial composi- 
tions. In view of past research on M (see 
particularly Singer, Wilensky, & McCraven,
1956; Singer & Spohn, 1954; Williams & 
Lawrence, 1954; Wittenborn, 1950b) this 
lends weight to the interpretation of F% 
offered here. At the same time, the factorial 
composition of W% seems independent of 
F% and may merit further attention as a 
possible indicator of a significant personality 
trait. 
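The communality reasoning above can be illustrated with a short sketch. It assumes orthogonal factors, so that a score's communality is the sum of its squared loadings and the remainder is the variance left unaccounted for. The loadings are hypothetical and are not taken from Table 4.

```python
def communality(loadings):
    """Sum of squared loadings of one score on orthogonal factors."""
    return sum(value * value for value in loadings)

def unexplained(loadings):
    """Share of the score's variance not accounted for by the factors."""
    return 1.0 - communality(loadings)

# Hypothetical loadings of one score on four factors.
row = [0.6, 0.5, 0.3, 0.1]
h2 = communality(row)
left_over = unexplained(row)
```

With these illustrative values, just over 70% of the variance is accounted for and the rest is specific or error variance.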

It must be recognized, of course, that fail- 
ure to achieve simple structure implies a lack 
of uniqueness to the set of rotated factors 
chosen for interpretation. Certainly, other ro- 
tations testing other hypotheses would have 
been possible. At the same time, the final 


solution is not arbitrary, and rests on a body 
of previous research on the Rorschach and 
on aspects of personality theory. 


First Response Times 

The first response times recorded for Beck's 
sample were also factor-analyzed. In addition, 
first response times in the Rorschach test were 
correlated with response times to the Street 
Gestalt Completion Test (1931) and visual 
reaction times, studies suggested by the re- 
sults of the factor analysis. 

Previous research on response times in the 
Rorschach test has raised serious questions 
concerning the tenability of such concepts as 
“color shock,” and “shading shock” (Dub- 
rovner, Von Lackum, & Jost, 1950; Lazarus, 
1949; Matarazzo & Mensh, 1952; Rabin & 
Sanderson, 1947; Siipola, 1950). There has 
been the positive suggestion that reaction 
time to the cards may be a function of the 
objective difficulty of each of the cards 
(Matarazzo & Mensh, 1952; Meer, 1955; 
Rabin & Sanderson, 1947). A search of the 
Table 5


Intercorrelations of First Response Times
to the Rorschach Cards
(Decimals omitted)


18 

34 

37-2 30 

28 tH 

46 3 27 

9 15 
10 20 


10 
19 
13 
38 17 40 
1) 26 $1 


54 

52 31 

34. «53 
42 34 


literature fails to reveal any previous factor 
analysis of first response times in the Ror- 
schach test. 

The object of this part of the study was to 
determine the number of factors necessary to 
account for the intercorrelations among first 
response times and to ascertain whether any 
such factors could be identified as difficulty 
or shock factors, as had been suggested
previously.


Procedures and Results 

The raw data were the first response times 
to each of the ten Rorschach cards obtained 
from the sample described above, and the 
product-moment correlation coefficients among 
these are shown in Table 5. The results of 
the centroid analysis yielding two factors are 
shown in Table 6.² Although the centroid
solution is arbitrary, a plot of the factors 
shown in Table 6 reveals immediately that 
while the structure might be simplified slightly 
by a small clockwise rotation of Factor I 
(10-15 degrees), the essential nature of the 
factors could not be altered by any sensible 
rotation. It was decided, therefore, to treat
the centroid analysis as the final solution,
without rotation.

Since there are relatively high loadings on
all cards, it is apparent that Factor I reflects
some general factor of individual differences
in response time. Factor II is extremely diffi-
cult to interpret. It has probably significant


² For the major portion of this part of the
statistical work, the author is indebted to Robert
Downing.
Table 6


Centroid Factor Loadings—First Response Times 


loadings of opposite signs on Cards 5 and 8, 
but no immediately sensible interpretation is 
suggested, since Cards 5 and 8 do not have 
any uniquely polar relationship to each other
which is phenomenally obvious or which has 
been suggested in the literature. It is interest- 
ing to note that the factors derived cannot 
account for much more than 50% of the vari- 
ance of response times to these cards. 

If Factor I is interpreted as a general 
factor, the problem arises whether it reflects 
general response speed of individual Ss, or 
whether it is peculiar to responses to inkblots, 
or perhaps even to the Rorschach inkblots in 
particular. To answer these questions the fol- 
lowing experiment was performed. 


The Relationship Between First Response 
Times on the Rorschach, Visual Reaction 
Time, and Response Times to Street Ge-
stalt Completion Cards ³


Thirty male college students were tested by 
means of the Rorschach test, the Street Ge- 
stalt test, and a Vernier chronoscope. Each S 
was given the Street Gestalt Completion Test 
first. Card A was used as a demonstration 
card to orient the S to his task. He was told 
to find the correct answers to the remaining 
cards as quickly as possible. He was given a 
ready signal and then a turn signal for each 
of the six cards, and his responses were timed 
by means of a stop watch. 

The 10 Rorschach cards were then given 
to the S one at a time, with the same instruc- 


³ The writer is indebted to William Rosenblith who
aided in the collection of these data. 


tions as used for the Street Gestalt test, i.e., 
to give a response as quickly as possible, tak- 
ing the whole inkblot into consideration. The 
S was not told that there were no correct an- 
swers in this test, since it was thought that 
this would make the situation somewhat less 
comparable to the Street Gestalt test. 

Finally, 10 readings on a Vernier chrono- 
scope were made of the S’s reaction time. E 
shielded his key from S’s view and told him 
to watch E’s pendulum. After a ready signal, 
the S was to release his pendulum as soon as 
he saw the other pendulum move. The need 
for speed was stressed. 

Through the use of these tasks, it was 
hoped to compare the Rorschach with a con- 
ceptually oriented task, as well as with a 
visual-motor reaction time. Thus, if there 
were some general response speed factors in- 
volving conceptual activity, there should be 
significant correlations between the Rorschach 
and Street tests; if the general factor were 
confined to sensory-motor reaction times, there 
should be a correlation with visual reaction 
times; or, of course, the comparisons might 
yield intercorrelations among all three tasks, 
indicating a very general response time factor. 


Results 


Mean response times to the Street cards 
and to the Rorschach cards yielded a rho of
.23; the rho between reaction time and the
Street Gestalt cards was —.03. Rhos between 
each of the 10 Rorschach cards and reaction 
time, and between each of the six Street Ge- 
stalt cards and reaction time, yielded no sig- 
nificant coefficients. It is clear, therefore, that 
no such general factor as cognition time or 
response time can be hypothesized to account 
for the general factor found in response times 
to the Rorschach test. It must be noted, of 
course, that the experimental procedure uti- 
lized with respect to response times on the 
Rorschach was unlike the clinical procedure; 
yet the instructions chosen, insofar as they 
were more homologous with the procedures in 
the visual reaction time and Street Gestalt 
experiments, should have favored higher in- 
tercorrelations among these tests. Factor I, 
therefore, appears to represent a factor more 
specifically related to inkblots and their effect
on response times.
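The rank-order coefficients reported above can be illustrated with a minimal sketch of Spearman's rho, computed as the product-moment correlation between ranks, with tied values given their average rank. The response times below are hypothetical, and the function names are introduced only for this example.

```python
import math

def average_ranks(values):
    """Rank values from 1 upward, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_rho(x, y):
    """Spearman's rho as the correlation between average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical mean response times (seconds) for five subjects.
street = [12.0, 30.5, 8.2, 19.9, 25.0]
rorschach = [5.1, 9.8, 4.0, 12.3, 7.7]
rho = spearman_rho(street, rorschach)
```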
Summary


A matrix of intercorrelations of the scores 
on 157 Rorschach protocols originally col- 
lected from a stratified normal sample by 
Beck et al. (1950) was factor-analyzed. The 
hypotheses tested were: 1. That there is a 
factor of psychological efficiency; 2. That the 
factorial composition of FC is different from 
that of C and CF; 3. That there is a factor 
of verbal productivity; 4. That there is a 
factor of compulsivity. Support was adduced 
for Hypotheses 1 and 4; Hypothesis 3 also 
seems tenable, but with less confidence; Hy- 
pothesis 2 is not confirmed. Perceptual con- 
trol, as reflected in F% and M%, seemed to 
be the most important variable involved in 
the conception of psychological health on the 
Rorschach test. 

A factor analysis of the intercorrelations 
among the first response times to the 10 cards 
of the Rorschach test yielded two factors, one 
of which seemed interpretable as a general 
timing factor. In order to determine its de- 
gree of generality, 30 subjects were given the 
Rorschach and Street Gestalt tests, and a
visual reaction time task. No significant in-
tercorrelations were found. It was concluded,
therefore, that the general factor seen in the
response times of the Rorschach test cannot 
be conceptualized as a generalized cognition 
time factor, but is probably related more spe- 
cifically to inkblots and their specific effects 
on response times. 


Received August 11, 1958. 
RECALLERS AND NONRECALLERS
OF DREAMS

Dreaming, most psychologists agree, is a
universal experience. Research with electri- 
cally recorded eye movements during sleep 
led Dement and Kleitman (1957) to conclude 
that “periods of . . . dreaming . . . are an 
intrinsic part of normal sleep” (p. 345), and 
that all people dream several times a night. 

Although it is likely that everyone dreams, 
many people report that they do not. The 
fact, then, that many persons have little or 
no recall of dreams raises interesting ques- 
tions. What are the distinguishing character- 
istics of persons who do, or do not, report 
that they dream? 

The present research was undertaken to 
examine some correlates of dream recall. Some 
hypotheses were drawn from Ramsey’s (1953) 
review of the literature on dreaming—that 
frequent recallers of dreams are younger, 
more intelligent, and more often women than 
men. Other hypotheses were derived from 
psychoanalytic theories (Fromm, 1951; Had- 
field, 1954) which hold that the dream makes 
possible an internal communication prohibited 
during consciousness because of the anxiety 
it would evoke. Two alternative propositions 
were put to test: that manifest anxiety is posi- 
tively related to the recalling of dreams, or 
that nonrecallers and frequent recallers are 
both more anxious than a less extreme group. 

The informal observation that many psy- 
choanalytic patients become progressively bet- 
ter able to remember dreams leads to the fol- 
lowing question: to what extent is this be- 
cause remembering is “the thing to do” under 
the circumstances, and to what extent because
it implements their motivated search for self 
awareness? Manifest needs were selected from 
those defined by Edwards (1954), and the 


following predictions made. If conformity is 
the determining factor, then people who recall 
dreams when it is asked for or expected of 
them might be expected to have high needs 
for achievement, deference, and authority, 
and a low autonomy need; if, on the other 
hand, a search for self awareness motivates 
dream recall, recallers might be expected to 
have high needs for endurance, intraception, 
and succorance. 


Procedure 


The Ss were 42 teachers and school guid- 
ance counselors who were students in a six- 
week graduate summer course in parent coun- 
seling. There were 15 white women, 13 Negro 
women, and 14 white men. Information con- 
cerning age was collected in terms of five-year 
intervals; taking midpoints, the age range 
was from 22 to 52 years, with the mean at 
33.8 years. Men and women did not differ 
significantly in age, but only 32% of the 
women were or had been married, in contrast 
to 62% of the men. 

On the first day of class a personal data 
form was distributed. These sheets contained 
code numbers, which were thereafter the only 
identifications on material submitted for the 
study. Each S was given a printed booklet 
containing a detailed instruction sheet and 28 
perforated pages. 

The instructions stated that one page was 
to be submitted each day for 28 days, con- 
taining a record of all dreams or experiences 
of dreaming while asleep during the preceding 
24 hours. Ss were told to write the dreams 
as completely as possible, including incongrui- 
ties, vague impressions, and the like; they 
were told how to report associations or clarifi-
cations so as not to contaminate the dream
report itself. Feelings during and after the 
dream were to be recorded; Ss were also in- 
structed to indicate whether the dream oc- 
curred while falling asleep, at an indetermi- 
nate time, or just before waking, or if it woke 
them during the night. Dreams were to be 
written immediately upon waking in the morn- 
ing, or on waking from a nap.¹

Each report sheet contained a short re- 
minder of some of the more important in- 
structions. The rest of the page was blank so 
that the dream could be recorded. At the bot- 
tom of each sheet Ss could check, if appro- 
priate, “No dreams in this interval,” “Aware 
of having had dream but can’t remember con- 
tent,” or “Above (recorded dream) is all I 
remember, but I know there was more.” Each 
S submitted a page every day. In tabulation, 
an S was credited with having remembered a 
dream if he could remember any fragment of 
content. Although some Ss reported more than 
one dream during a single report period, fre- 
quency of recall was defined for this study as 
the number of report periods in which one or 
more dreams were recalled. 

During the second week of the experiment 
the Edwards Personal Preference Schedule 
(1954) was given to the Ss to complete at 
home. Early in the fourth week the IPAT 
Anxiety Scale (Cattell, 1957) and a 20-word 
vocabulary test (Thorndike, 1942) were ad- 
ministered in class. Briefly, the Edwards is an 
inventory of paired statements, matched for 
social acceptability, in which the forced-choice 
scores correspond to a need system based on 
the work of H. A. Murray. The IPAT Anx- 
iety Scale is a 40-item questionnaire based on 
factor analysis of a number of measures, and 
yielding split-half reliabilities of .84 and .91 
on different populations; validity studies have 
also yielded positive results. The multiple- 
choice vocabulary test was derived from the 
IER Intelligence Scale CAVD; it correlates 
.50 with a general intelligence factor among a 
sample of adult males (Thorndike, Norris, & 
Morrill, 1952) and .62 with the 1916 Binet
administered to hospital patients (Miner,
1952).


¹ The detailed instruction sheet has been deposited
with the American Documentation Institute. Order
Document No. 6015, remitting $1.25 for microfilm,
or $1.25 for photocopies.


Results 

In all, 1176 reports were collected, of
which 215 contained dreams. The range of
individual recall frequency was from 0 to 19
days; the mean frequency was 5.1, the me- 
dian 4.0, indicating that half the Ss recalled 
dreams only 14% of the time or less. Fifteen 
Ss recalled dreams only once or not at all; 
these constitute the Nonrecallers. Thirteen of 
the Ss recalled dreams 25% of the time or 
more; these constitute the Recallers. The Re- 
callers consisted of four men and nine women, 
of whom four were Negroes; the Nonrecallers 
contained five men and ten women, of whom 
five were Negroes. 
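The tallies in the preceding paragraph can be checked with a short sketch. The counts come from the text. The function `classify` and the label "Middle" are introduced here only for illustration, and the cutoffs follow the definitions given above (once or not at all, and 25% of the report periods or more).

```python
N_SUBJECTS = 42
N_DAYS = 28
TOTAL_DREAM_REPORTS = 215

total_reports = N_SUBJECTS * N_DAYS                 # pages submitted
mean_frequency = TOTAL_DREAM_REPORTS / N_SUBJECTS   # dream days per subject
median_share = 4.0 / N_DAYS                         # median as share of days

def classify(days_with_dream, n_days=N_DAYS):
    """Assign a subject to a recall group from the number of dream days."""
    if days_with_dream <= 1:
        return "Nonrecaller"
    if days_with_dream / n_days >= 0.25:
        return "Recaller"
    return "Middle"
```

Under these cutoffs a subject needs at least 7 dream days out of 28 to count as a Recaller.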

Sex and race. The mean frequency of dream 
recall for men was 4.57, and for women 5.39; 
while in the predicted direction, the difference 
(measured by t test) was not significant. The
mean for Negro women was 5.0, for white 
women 5.73, a nonsignificant difference. There 
was no sex difference in the constituency of 
the two groups. 

The means and standard deviations of the 
Recallers and Nonrecallers were computed 
on all variables; the differences between the 
group means were measured by t tests.
Table 1 presents these data. When a differ- 
ence was significant, the point biserial cor- 
relation is also presented to indicate the de- 
gree of relationship between the variable and 
dream recall. All comparisons are for the two 
extreme groups only. 
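The link between the group comparison and the point biserial correlation can be sketched as follows. The formula uses the standard deviation of the combined groups (dividing by n), and the conversion to t assumes the usual pooled-variance test with n - 2 degrees of freedom. The scores are hypothetical and are not the values of Table 1.

```python
import math

def point_biserial(group1, group0):
    """Point biserial r between group membership and a continuous score."""
    n1, n0 = len(group1), len(group0)
    n = n1 + n0
    pooled = list(group1) + list(group0)
    mean_all = sum(pooled) / n
    sd_all = math.sqrt(sum((v - mean_all) ** 2 for v in pooled) / n)
    m1 = sum(group1) / n1
    m0 = sum(group0) / n0
    return (m1 - m0) / sd_all * math.sqrt(n1 * n0 / n ** 2)

def t_from_r(r, n):
    """Equivalent t statistic for a correlation based on n cases."""
    return r * math.sqrt((n - 2) / (1 - r * r))

# Hypothetical scores for two small extreme groups.
recallers = [3, 4, 5]
nonrecallers = [1, 2, 3]
r_pb = point_biserial(recallers, nonrecallers)
t = t_from_r(r_pb, len(recallers) + len(nonrecallers))
```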

Age and manifest needs. As Table 1
reveals, age is not related to frequency of dream
recall. Of the manifest needs, the only signifi- 
cant difference was for succorance; five of the 
other six hypothesized differences were in the 
predicted direction, but were insignificant. It 
seems most economical to conclude that the 
difference in succorance was a chance finding, 
and that there is no relationship between these 
needs as tested and recalling or not recalling 
dreams. 

Intelligence and manifest anxiety. Table 1 
reveals that Recallers were significantly higher 
than Nonrecallers both in intelligence and in 
anxiety. The mean anxiety score of the middle 
group (N = 14) in dream recall was 27.07, 
Table 1


Mean Differences Between Recallers and Nonrecallers in Age, Vocabulary, Manifest Needs,
Manifest Anxiety, and Contentless Recall


Recallers 
(N = 13) 
M 


Variable SD 


Age, years 36.84 13.13 


Vocabulary 15.15 3.46 
Manifest anxiety 
Total 
Overt 
Covert 
Self sentiment” 
Ego strength» 
Guilt proneness 
Ergic tension 


34.92 
18.38 
16.54 
6.38 
5.00 
11.38 
8.08 


Manifest needs‘ 
Achievement 
Deference 
Exhibitionism 
Autonomy 
Intraception 
Succorance 
Endurance 


14.54 
15.62 
10.00 
10.77 
19.23 
14.53 


3.73 
3.91 
4.19 
2.83 
4.14 
4.07 
5.69 
8.85 


Contentless recall 3.19 


a One-tailed tests were applied to the predicted differences, tw... the subcategories o... contentless recall.
b High scores indicate low self-sentiment and ego strength.
c For these variables, N = 14 for Nonrecallers, as one S did ...
d Hypothesis not confirmed; hence, difference not tested.


falling between rather than below the means 
of the Recallers and the Nonrecallers; any 
test, then, of the hypothesis that this group 
would have lower mean anxiety than the ex- 
treme groups is unnecessary. However, in or- 
der to determine whether the relationship be- 
tween the variables was significant over the 
whole distribution, the correlation ratio was 
computed and equaled 4.53, significant at bet- 
ter than the .05 level. It may be concluded 
that there is a positive relationship between 
manifest anxiety and dream recall. 
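A correlation ratio over several recall groups can be sketched as below. This is a generic illustration of eta, the square root of the between-group sum of squares divided by the total sum of squares. It is not a reconstruction of the computation reported above, and the anxiety scores are hypothetical.

```python
import math

def correlation_ratio(groups):
    """Eta: square root of between-group over total sum of squares."""
    values = [v for group in groups for v in group]
    grand_mean = sum(values) / len(values)
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    ss_between = sum(
        len(group) * (sum(group) / len(group) - grand_mean) ** 2
        for group in groups
    )
    return math.sqrt(ss_between / ss_total)

# Hypothetical anxiety scores for low, middle, and high recall groups.
groups = [[20, 24], [26, 28], [33, 37]]
eta = correlation_ratio(groups)
```

Unlike a product-moment coefficient, eta does not assume a linear trend across the groups, which suits a test over the whole distribution of recall frequency.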


Discussion 


The findings of this study are, of course, 
limited because of the relatively small num- 
ber of Ss in each of the groups. Nevertheless, 
some of the findings are comparable to those
of some earlier studies, and, where there are
discrepancies, other factors may be respon-
sible, such as methodological differences and
the fact that the present sample is somewhat 
older than many of the other samples, as well 
as more homogeneous in occupation and more 
heterogeneous in race. That the intelligence 
and educational levels of the group were rela- 
tively high—the mean vocabulary score of 
the present sample was 13.84, equivalent to 
approximately the 90th percentiles of other 
groups tested (Hagen & Thorndike, 1955) 
and to the 73rd percentile of the ACE (Miner, 
1957)—limits the generality of the findings, 
but does not significantly differentiate this 
sample from previous samples in related 
studies. 

There was a striking variation from day to 
day in the number of people recalling dreams; 
the range was from 2 to 17 a day. A study 
of the Recallers’ daily variations revealed that 
there was no significant trend for different 
days of the week. Coefficients of correlation
were also computed between the number of 
nocturnal dream reports from the Recallers 
each day and the average temperature, hu- 
midity, and humiture for the corresponding 
period (10:00 p.m. to 7:00 a.m.);² the cor-
relations were .16, .01, and .06, respectively.
Previous studies have not accounted for either 
individual or group variability in recall over 
time. The present study has eliminated cer- 
tain factors, but leaves the observation un- 
explained. 

The greatest deviation from earlier findings 
is that neither sex (McElroy, 1952; Middle- 
ton, 1942) nor age (Kleitman, 1939) was as- 
sociated with ability to recall dreams. In view 
of this discrepancy, the present sample was 
compared with those of Middleton (1933; 
1942) to determine whether recall produc- 
tivity itself differed to any great extent. In 
spite of the fact that Middleton's findings 
were based on responses to a questionnaire 
requiring no documentation, the percentages 
of his Ss reporting experiences of dreaming 
and frequent or very frequent dreaming are 
in fairly close agreement with percentages of 
the present sample reporting at least one 
dream, or being categorized as Recallers, re- 
spectively. Compared with reports by Kleit- 
man (1939), the present sample was not 
atypical in the percentage stating that they 
never dream. The finding that Recallers score 
higher in verbal intelligence confirms earlier 
findings (Ramsey, 1953), although the rela- 
tionship was not a major one; the fact that 
dream reports were written, thus posing 
greater demands both motivationally and op- 
erationally upon the less verbally adept, sug- 
gests that, notwithstanding the earlier findings, 
the relationship may be partly artifactual. 

The results concerning the more dynamic 
factors of manifest needs and manifest anx- 
iety raise interesting theoretical questions, 
particularly with respect to the unknown re- 
lationship between manifest and covert mani- 
festations of the same characteristic. The need 
variables, for example, were initially thought 
of in two categories. In the first, thought of 


² These data were obtained from the United States
Weather Bureau. Humiture is a comfort-discomfort
measure obtained by averaging temperature and hu-
midity for a given time period.
as situational, it was predicted that individu-
als who had needs to do well (achievement),
to “show off” (exhibitionism), and to please
authority (deference and low autonomy)
might be motivated to recall dreams because
it was a classroom instructor who was asking
them to do so; in a more pervasive motiva- 
tional category, it was predicted that people 
with stronger needs to understand themselves 
(intraception), to seek help (succorance), 
and to stick with problems (endurance) 
would be found among the frequent Recallers, 
in accordance with the theory that remember- 
ing dreams offers a unique opportunity for 
insight (Fromm, 1951), creativity (Murphy, 
1947), and problem solution (Hadfield, 1954). 
The negative findings leave open the question 
raised initially concerning the motivation for 
increased recall of dreams during psychoana- 
lytic therapy. It may be that the instrument 


used is not a valid one, or that needs with the 


names and characteristics studied here are re- 
lated to dream recall but not at the manifest 
level. 

Similarly, the finding that manifest anxiety 
plays a substantial role in dream recall can 
mean either that anxious people are more 
urgently pressed to resolve the conflicts that 
dreams theoretically illustrate, that they have 
more such conflicts, or that persons who have 
erected fewer barriers between themselves and 
awareness of their anxiety are also more fully 
in touch with the rest of their internal ex- 
perience and, hence, recall dreams more fre- 
quently. The latter view receives some sup- 
port from the fact (Table 1) that the groups 
differed in overt but not covert anxiety, and 
that they did not differ in unresolved tensions 
(ergic tensions), but that the Nonrecallers 
were more able to “control and express” them 
realistically (Cattell, 1957, p. 5). 

The high correlation between remembering 
dreams and the contentless recall of dream- 
ing also supports the view that low dream re- 
call, and possibly low manifest anxiety, is re- 
lated to repression. While it is obvious that, 
in order to recall dreams, one must experi- 
ence dreaming, it does not necessarily follow 
that those who do not recall dreams would 
also not remember dreaming, especially since 
the latter is an “easier” task, and the two 
categories were mutually exclusive. It seems 
likely that the almost absolute lack of recall 
of dreaming itself is due to factors associated 
with repression or control, and that it is there- 
fore symptomatic of a more general lack of 
awareness of ongoing internal processes. 


Summary 


Forty-two graduate students in education 
turned in reports on recalled dreams every 
day for four weeks. They also completed the 
Edwards Personal Preference Schedule, the 
IPAT Anxiety Scale, and a short multiple- 
choice vocabulary test. On the basis of the 
frequency with which they recalled dreams, 
subjects were divided into a group of Re- 
callers and a group of Nonrecallers. By means 
of t tests applied to differences between the 
means of the subgroups on the variables 
tested, the following conclusions were reached: 


1. Men and women do not differ in fre- 
quency of dream recall. Small sample size may 
be a factor here, since the nonsignificant dif- 
ference was in the predicted direction. 

2. Dream Recallers are not younger than 
Nonrecallers within the age range covered in 
this study. 

3. Dream Recallers are more intelligent 
than Nonrecallers. 

4. There is a positive relationship between 
manifest anxiety and frequency of dream re- 
call. The findings also suggested that the dif- 
ference between Recallers and Nonrecallers 
was in overt rather than covert anxiety, 
and that, although they do not differ in 
unresolved tensions, Recallers have less ego 
strength than Nonrecallers. 

5. There is no relationship between the 
frequency of recalling dreams and manifest 
needs for achievement, deference, exhibition- 
ism, intraception, succorance, endurance, or 
autonomy. 

6. There is no relationship between diurnal 
variations in dream recall and variations in 
temperature, humidity, or humiture. 

7. Contentless recall of dreaming is posi- 
tively related to recall of dreams. 


The study was based upon the assumption 
that dreaming is a universal process. On the 
basis of the present findings, variations in 
ability to recall dreams or dreaming were dis- 
cussed in terms of a repressive factor operat- 
ing most successfully in total Nonrecallers. 
Attention was called to the limitations of the 
study. 

Received August 15, 1958. 

REFERENCES 

Cattell, R. Handbook for the IPAT Anxiety Scale 
(self analysis form). Champaign, Ill.: Instit. Per- 
sonality & Ability Testing, 1957. 

Dement, W., & Kleitman, N. The relation of eye 
movements during sleep to dream activity: An ob- 
jective method for the study of dreaming. J. exp. 
Psychol., 1957, 53, 339-346. 

Edwards, A. L. Manual for the Edwards Personal 
Preference Schedule. New York: Psychological 
Corp., 1954. 

Fromm, E. The forgotten language. New York: 
Rinehart, 1951. 

Hadfield, J. A. Dreams and nightmares. London: 
Penguin, 1954. 

Hagen, Elizabeth, & Thorndike, R. L. Normative 
test data for adult males obtained by house-to- 
house testing. J. educ. Psychol., 1955, 46, 207-216. 

Kleitman, N. Sleep and wakefulness. Chicago: Uni- 
ver. Chicago Press, 1939. 

McElroy, W. A. The frequency of dreams. Quart. 
Bull. Brit. psychol. Soc., 1952, 3, 91-94. 

Middleton, W. C. Nocturnal dreams. Sci. Mon., 
1933, 37, 460-464. 

Middleton, W. C. The frequency with which a group 
of unselected college students experience color 
dreaming and color hearing. J. genet. Psychol., 
1942, 22, 221-229. 

Miner, J. B. Intelligence in the United States. New 
York: Springer, 1957. 

Murphy, G. Personality. New York: Harper, 1947. 

Ramsey, G. V. Studies of dreaming. Psychol. Bull., 
1953, 50, 423-455. 

Thorndike, R. L. Two screening tests of verbal in- 
telligence. J. appl. Psychol., 1942, 26, 128-135. 

Thorndike, R. L., Norris, R., & Morrir, C. S. 
General aptitude test battery scores in a regional 
sample. USAF Hum. Resour. Res. Cent., Res. Note, 
1952, No. 52-16. 





Journal of Consulting Psychology 
Vol. 23, No. 5, 1959 


INTERRELATIONSHIPS AMONG MMPI MEASURES 
OF DISSIMULATION UNDER STANDARD AND 
SOCIAL DESIRABILITY INSTRUCTIONS ¹ 


JERRY S. WIGGINS 


Stanford University 


Current studies of what Jackson and Mes- 
sick (1958) have recently termed “stylistic” 
aspects of responses to personality inventories 
have devoted considerable attention to the 
“social desirability” component (Corah, Feld- 
man, Cohen, Gruen, Meadow, & Ringwall, 
1958; Edwards: 1954, 1957; Fordyce, 1956; 
Hanley: 1956, 1957; Hillmer, 1958; Rosen, 
1956; Voas, 1956; Wiggins & Rumrill, 1959). 
Social desirability response style may be de- 
fined as a general tendency to endorse per- 
sonality inventory items that are judged to be 
socially acceptable by people in general (Ed- 
wards, 1957). Measurement of this response 
style has been indirect, at best. The standard 
method of demonstrating its presence has 
been to correlate item endorsement frequency 
with independently rated item social desir- 
ability in groups of subjects taking a per- 
sonality inventory (Edwards, 1957; Hanley, 
1956). Such studies have demonstrated that 
about 76% of the variance involved in group 
endorsement may be accounted for by item 
social desirability values alone. This approach 
provides little information about the extent to 
which individuals are answering in terms of 
item social desirability, and the reported 
relationship appears to be so omnipresent 
among self-report devices as to be of little 
practical significance (Corah et al., 1958; 
Hillmer, 1958; Jackson & Messick, 1958). 
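
The group-level demonstration described above can be sketched in a few lines. This is only an illustration under stated assumptions: the item values below are hypothetical and are not data from any of the cited studies, and `pearson_r` is a helper written for this sketch. The squared correlation gives the proportion of variance in endorsement frequency accounted for by rated desirability, so a figure of 76% corresponds to a correlation of about .87.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical items: mean rated social desirability (9-point scale)
# and proportion of a group endorsing each item.
desirability = [1.5, 2.0, 3.5, 5.0, 6.5, 7.0, 8.5]
endorsement = [0.10, 0.25, 0.30, 0.55, 0.60, 0.85, 0.90]

r = pearson_r(desirability, endorsement)
variance_accounted_for = r ** 2  # proportion of group endorsement variance
print(f"r = {r:.3f}, r squared = {variance_accounted_for:.3f}")
```

Note that such a coefficient is computed over items, not over persons, which is why it says little about how far any individual is answering in terms of item desirability.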

A more direct approach to the measurement 
of this response style is represented by the 
attempts to develop social desirability dis- 
simulation “scales” which, in principle, serve 
to identify individual dissemblers (Edwards, 
1957; Hanley, 1957; Wiggins & Rumrill, 
1959). Although not lacking in face validity 
or theoretical relevance, these scales lack evi- 
dence regarding their empirical (predictive) 
validity. Thus, the social desirability scale of 
Edwards (1957) (hereafter called En ²) was 
constructed from rational sortings of 10 
judges. Evidence of its validity is its high 
correlations with the MMPI clinical scales 
and the K scale (Fordyce, 1956) and its low 
correlations with the Edwards (1954) EPPS 
scales which are presumably free of social de- 
sirability. The scale of Hanley (Ex) is more 
elaborately derived, although its validity again 
appears to rest on correlations with the MMPI 
scales, the K scale and the En scale (Hanley, 
1957). 

1 This investigation was supported in part by a re- 
search grant, M-2215, from the National Institute of 
Mental Health of the National Institutes of Health, 
Public Health Service. 

The validity of these scales becomes an im- 
portant consideration when the argument is 
reversed and the scales are used to show the 
susceptibility of the MMPI to social de- 
sirability response style (Edwards, 1957; 
Fordyce, 1956; Hanley, 1957) or the rela- 
tive freedom of the EPPS from such tend- 
encies (Edwards: 1954, 1957). The possibil- 
ity exists that these scales (including K) are 
not adequate empirical measures of social de- 
sirability and cannot be considered as useful 
operational definitions of this tendency. In 
addition, there are reasons to believe that 
other stylistic components, such as response 
acquiescence, may be involved in these scales 
so that interpretations of their measurement 
properties must be highly qualified (Jackson 
& Messick, 1958; Wiggins & Rumrill, 1959). 
These possibilities may be investigated by re- 
course to the method of contrasting groups 
which estimates the actual efficiency of these 
scales in identifying social desirability dis- 
semblers. 

2 This scale is usually called SD but is here re- 
ferred to as En to indicate the 39-item version and 
to avoid confusion with other scales in the present 
study. 

The method of contrasting groups has previ- 
ously been used to estimate the efficiency of 
existing MMPI measures of “defensiveness” 
and “plus-getting” (Cofer, Chance, & Judson, 
1949; Gough: 1950, 1954; Grayson & Olinger, 
1957; Hunt, 1948; McKinley, Hathaway, & 
Meehl, 1948; Meehl & Hathaway, 1946; 
Rosen, 1956; Schmidt, 1948), to estimate the 
fakability of certain scales (Benton, 1945), 
and to develop new measures of “dissimula- 
tion” (Gough, 1954) and “defensiveness” 
(Cofer et al., 1949). It would therefore seem 
to be the method of choice for evaluating cur- 
rent measures of “social desirability” response 
tendencies. The traditional procedure in these 
“role-playing” studies has been to contrast 
the protocols of small groups of subjects who 
have taken the MMPI under standard and 
one or more “fake” instructions. That this 
procedure introduces elements not present in 
the population to which results are general- 
ized is suggested by the studies of Grayson 
and Olinger (1957) and Voas (1956). Gray- 
son (1957) reports patients’ comments such 
as: “Well, I just put down the opposite of 
what I did yesterday” (p. 75). Voas (1956) 
presents clear evidence that contiguity in 
time between self and “socially acceptable” 
answering tends to increase honesty of self- 
report, which would affect a counterbalanced 
design. The present study was designed with 
the recognition that the population about 
which we would generalize contains subgroups 
of high dissimulators and relatively honest 
people—rather than people taking the MMPI 
under several conditions. It therefore employs 
the method of contrasting two large groups 
randomly selected from a homogeneous popu- 
lation. 

An appropriate criterion group for social 
desirability faking is difficult to obtain for, as 
Edwards (1957) has pointed out, the extent 
of social desirability faking in a “normal” 
group is unknown. This shortcoming is not 
peculiar to social desirability response style, 
however, and it seems reasonable to assume 
that a relatively uniform group of high social 
desirability responders may be created by 
special instructions to the subjects. Such a 
group would differ in the amount of this tend- 
ency and could, in principle, serve as a cri- 
terion group for distinguishing an unusual 
amount of social desirability response style 
from a small amount. Scales which purport 
to measure social desirability response style 
should presumably achieve a certain amount 
of success in differentiating such a group from 
a “normal” group. 

The present study is concerned with the 
empirical success of several MMPI measures 
of “social desirability,” “defensiveness,” “dis- 
simulation,” and “response bias” in distin- 
guishing groups of social desirability role- 
players from comparable control groups. 
Attention is also focused on the interrela- 
tionships of these measures under both dis- 
simulation and standard conditions. 


Method 


The full scale MMPI was administered to 
440 undergraduate men and women under one 
of two sets of instructions. The 190 Ss in the 
control group received the standard set of in- 
structions that is printed on the front of the 
booklet. The 250 Ss in the experimental group 
received a modified set of instructions that 
was stapled over the standard instructions on 
the front of the booklet. The modified in- 
structions read, in part: 


Read each statement and decide whether People in 
General would consider a true or a false answer to 
be more desirable. You are not asked whether the 
statement is true or false as applied to you. Rather 
you are asked to decide which answer you think 
People in General would consider to be more de- 
sirable. 


These instructions were supplemented by an 
explanation that the judgment to be made 
was of the general values of American culture 
rather than of any particular subgroup and 
that the hypothetical respondent was to be 
considered as of the same sex as the judge. 

The protocols were scored for the standard 
MMPI clinical scales, including Welsh’s fac- 
tor scales. In addition, 11 special scales were 
scored which are presumably measures of 
dissimulation, social desirability, or response 
bias. These scales are briefly described below. 

L, F, K: These are the well-known validity scales 
of the MMPI (Meehl & Hathaway, 1946). L is a 
15-item rational measure of Ss’ tendencies to give 
an unrealistically favorable picture of themselves. F 
is a 64-item rational measure of malingering which 
consists of items scored in a direction that is rarely 
chosen by normals. K is a 30-item empirical measure 
of “defensiveness” or “minus-getting.” Twenty-two 
items were selected which differentiated patients with 
high L scores and normal profiles from a comparable 
group of patients with abnormal profiles. To these 
items were added eight items which remained un- 
changed under role-playing instructions in normal 
groups and which also differentiated severely dis- 
turbed patients from normals. 

En: Refers to the 39-item rational social desir- 
ability scale developed by Edwards (1957) and 
Fordyce (1956) and usually designated SD. Ten 
judges were instructed to answer 149 items (from 
the L, F, K, and Taylor Anxiety scales) in such a 
way as to give the most socially desirable picture of 
themselves (Edwards, 1958). There was unanimous 
agreement on 79 items which were subsequently re- 
duced to 39 items by item analysis. Items are keyed 
in the desirable direction. 

Ex: Refers to the 26-item rational social desir- 
ability scale developed by Hanley (1957). A pool of 
53 items (which were endorsed by 36 to 64% of 
Hathaway’s college students) was given to 92 judges 
to rate on a 9-point scale of social desirability. A 
statistical criterion of high- and low-rated items was 
set, and the 26 items that survived were keyed in 
the desirable direction. 

Sd-A, Sd-R: Refer to the 39- and 40-item rational 
social desirability scales developed from Welsh’s Fac- 
tor Scales A and R (Wiggins & Rumrill, 1959). The 
combined pool of 79 items was rated for both “true” 
and “false” responses on a 7-point scale by a total of 
181 judges. Sd-A is a pool of low-rated items, and 
Sd-R is a pool of neutral or moderate items. Both 
scales are keyed in the desirable direction. 

Cof: Refers to the 34-item empirical “lie” scale 
developed by Cofer and associates (1949). This scale 
consists of those reliable items which were unchanged 
under fake-bad instructions but were changed under 
fake-good instructions in a design utilizing a total 
of 81 Ss. Two judges agreed that this item pool rep- 
resents social statements of low desirability value 
that are frequently endorsed. Items are keyed in a 
desirable direction. Cross-validations have not been 
reported. 

Ds: Refers to the 74-item empirical “dissimula- 
tion” scale developed by Gough (1954) in his study 
of misconceptions about neuroticism. This scale con- 
sists of those items which differentiate genuine neu- 
rotics and normals from Ss instructed to fake neu- 
roticism. A group of 111 dissemblers was contrasted 
with 176 patients. Items are keyed in the direction 
of neurotic dissimulation. Extensive cross-validational 
data are reported. 

B: Refers to the 63-item rational “response bias” 
scale developed by Fricke (1957). This scale consists 
of those non-K items endorsed by 40 to 60% of 
Hathaway’s combined college and normal groups. 
These items are arbitrarily keyed “true” and pro- 
vide a measure of acquiescent or response-bias tend- 
encies. 

Sd: Refers to a 40-item empirical social desirability 
scale developed in the course of the present study.³ 
This scale consists of those items which discriminate 
Ss instructed to answer in a socially desirable direc- 
tion from a comparable control group. A group of 
178 dissemblers was contrasted with 140 controls. 
Items are keyed in a socially desirable direction. 
Cross-validational data is available. 


Results and Discussion 


Clinical Scales 


Comparisons of mean raw scores on the 
clinical scales between the dissimulation and 
control groups were made separately for men 
and women by means of t tests. In contrasting 
the mean scale scores of the 144 experimental 
men with the 105 control men, significant re- 
ductions at the .001 level were found on Si, 
Hy, and R (in order of decreasing magni- 
tude). At the .01 level, a decrease occurred 
on Pt and an increase on Ma. Welsh’s Scale 
A and D were found to have decreased sig- 
nificantly at the .05 level. To the college male, 
social desirability seems to involve social ex- 
traversion, high activity, and an absence of 
anxiety and somatic complaints. (The last in- 
ference comes from the fact that Hy-subtle 
scores tended to be almost identical under the 
two conditions, while reliable differences oc- 
curred on Hy.) 
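
A comparison of this kind between an experimental and a control group can be sketched as follows. The sketch assumes the conventional pooled-variance t test for independent groups, which the text does not specify, and the raw scores shown are hypothetical rather than taken from the present samples; `pooled_t` is a name chosen for illustration.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(group_a, group_b):
    """Independent-groups t with pooled variance; returns (t, df)."""
    na, nb = len(group_a), len(group_b)
    df = na + nb - 2
    # Pooled estimate of the common population variance.
    pooled_var = ((na - 1) * variance(group_a)
                  + (nb - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(group_a) - mean(group_b)) / standard_error, df

# Hypothetical raw scores on one clinical scale.
experimental = [18, 21, 17, 20, 19, 16, 22]
control = [25, 27, 24, 29, 26, 23, 28]

t, df = pooled_t(experimental, control)
print(f"t = {t:.2f} with {df} df")
```

A negative t here reflects a lower mean in the experimental group, which is the direction of the reductions reported above.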


Larger and more numerous differences were 
found in contrasting the 106 experimental 
women with the 85 control women. Signifi- 
cant reductions in mean score were found on 
Si, Mf, Pt, A, R, Hy, and Hs, all at less than 
.001. At the .01 level, decreases occurred on 
D, Sc, and Pd. College women seem to ascribe 
more stringent and widespread criteria of so- 
cial desirability to our culture (as our culture 
does to college women). Again we find social 
extraversion and an absence of anxiety and 
somatic complaints as socially desirable, this 
time without the associated high activity level. 
More pronounced among women than among 
men are the decreases in deviant thinking and 
deviant social attitudes found in Sc and Pd. 
The latter two were also tendencies in men, 
although not statistically reliable ones. A 
striking difference between experimental and 
control women which did not occur between 
men was found on the Mf scale. By implica- 
tion, college women tend to see themselves as 
more masculine than our culture would have 
them. 

3 Description of scale development in a later 
publication. 


Table 1 

Mean Raw Scores on Dissimulation Measures Under Standard and Social Desirability Instructions 

Groups: Experimental Men (N = 144), Control Men (N = 105), Experimental Women (N = 106), Control Women (N = 85). 
Measures: L, F, K, En, Ex, Sd-A, Sd-R, Cof, Ds, B, Sd. 


These clinical scale differences are highly 
similar to those reported by Rosen (1956) in 
a situation in which the same Ss responded 
in terms of self-rating and social desirability. 
Earlier reports of normal Ss faking good on 
the MMPI (Cofer et al., 1949; Hunt, Carp, 
Winder, & Kantor, 1948; McKinley et al., 
1948; Meehl & Hathaway, 1946) tended to 
minimize scale differences. They emphasized 
that the fake-good profile is slightly lower, 
but of such similar shape that it cannot be 
discriminated from other normal profiles. The 
present findings (as well as those of Rosen) 
in no way contradict this earlier statement. 


Validity Scales 


Table 1 presents the mean raw scores on 
the dissimulation measures under standard 
and role-playing instructions for men and 
women treated separately. The reliable differ- 
ences in L scale scores for both men and 
women are in accord with those reported by 
Cofer et al. (1949) and by Rosen (1956). 
Mean F scores under the two conditions were 


virtually identical for men and highly similar 
for women, which further confirms the ob- 
servation that F is of no value in detecting 
positive malingering (Cofer et al., 1949; 
Gough, 1950; Hunt, 1948; Meehl & Hatha- 
way, 1946). Although a reliable difference 
in mean K raw score occurred between the 
groups of women, K scores were almost identi- 
cal in the two groups of men. The additive 
combination of L and K suggested by Cofer 
would contribute nothing further to the de- 
tection of male dissemblers in the present 
study. Cofer reports differences in K for a 
mixed group, and Rosen found differences for 
the sexes treated separately. However, an ear- 
lier study reported in McKinley et al. (1948) 
found no differences in K for either of the 
sexes treated separately. McKinley suggests 
that no differences were found on K because 
of some factor in the experimental situation 
that resulted in exceptionally high Ks under 
standard conditions. Apparently, this factor 
was not operative in the present design, since 
the raw K scores under control conditions for 
Stanford men and women (Table 1) do not 
significantly depart from the norms given by 
McKinley for college men and women (X̄ = 
16.10, σ = 5.15; X̄ = 15.58, σ = 4.20). The 
actual screening efficiency of L and K in 
identifying college dissemblers is discussed in 
the next section. 


Special Scales 


The remaining columns of Table 1 show 
the special scale differences under standard 
and role-playing instructions for both sexes. 
For the men, Sd, Cof, Ex, and Sd-R show 
a reliable difference at the .001 level. Sd-A 
was different at the .01 level and En at the 
.05 level. The response bias scale (B) ap- 
proached, but did not attain, significance in 
this group. For the women, Sd, Cof, Ex, 
Sd-R, Sd-A, and En were all reliably different 
at the .001 level. 

The actual screening efficiency of these 
measures was assessed by combining the sam- 
ples into a group of 250 dissimulators and a 
group of 190 controls and calculating the pro- 
portion of hits and misses at various cutting 
scores for each scale. Space limitations pro- 
hibit the presentation of the complete dis- 
tribution for each of the scales. However, 
Table 2 shows the degree of successful classi- 
fication involved at the cutting score which 
minimizes both false positives and false nega- 
tives. 
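
The tabulation of hits and misses at successive cutting scores can be illustrated with a brief sketch. Several things here are assumptions made for the example: the scale scores are hypothetical, the function names are invented, and "minimizes both false positives and false negatives" is read as minimizing the sum of the two error proportions, which is only one possible reading of the criterion.

```python
def screening_rows(simulated, authentic, cutoffs):
    """For each cutting score c (records scoring c or above are called
    simulated), return (c, miss_rate, false_positive_rate)."""
    rows = []
    for c in cutoffs:
        # Simulated records falling below the cut are called authentic.
        miss_rate = sum(score < c for score in simulated) / len(simulated)
        # Authentic records at or above the cut are called simulated.
        false_pos = sum(score >= c for score in authentic) / len(authentic)
        rows.append((c, miss_rate, false_pos))
    return rows

def best_cutoff(rows):
    """Row with the smallest combined error (assumed criterion)."""
    return min(rows, key=lambda row: row[1] + row[2])

# Hypothetical scale scores for instructed dissemblers and controls.
simulated = [25, 23, 22, 21, 19, 18, 26, 24]
authentic = [12, 15, 17, 20, 14, 16, 13, 22]

rows = screening_rows(simulated, authentic, range(15, 27))
best = best_cutoff(rows)
print(best)
```

Lowering the cutting score trades misses for false positives, so the preferred cut depends on which error is more costly in a given application.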

A certain amount of overlap between the 
distributions of dissemblers and controls is 
to be expected on the grounds that an as yet 
unknown proportion of dissemblers will be 
found in any control group. However, it is 
clear from Table 2 that a group of known 
(instructed) dissemblers may be differenti- 
ated from such a control group with better 
than chance success. The advantage of em- 
pirical scales in this enterprise is also appar- 
ent. The success of Sd in identifying dis- 
semblers is, no doubt, somewhat spurious in 
that the scale was developed on part (72%) 
of the present sample. To test the extent of 
spurious inflation, Sd was scored in independ- 
ent samples of 50 control college men and 72 
college men instructed to answer in terms of 
social desirability. At a cutting score of 21 or 
above, the scale successfully identified 68% 
of the dissemblers and 100% of the controls 
(phi = .683). The success of Sd in the pres- 
ent study is therefore more than suggestive. 
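
The phi coefficient quoted for the independent samples can be checked from the reported percentages. The cell frequencies below are reconstructed from those percentages (68% of 72 dissemblers, 100% of 50 controls) and are therefore an assumption, since the text does not give the fourfold table itself.

```python
from math import sqrt

def phi_coefficient(a, b, c, d):
    """Phi for a fourfold table:
           called simulated   called authentic
    row 1:        a                  b          (dissemblers)
    row 2:        c                  d          (controls)
    """
    return (a * d - b * c) / sqrt((a + b) * (c + d) * (a + c) * (b + d))

dissemblers, controls = 72, 50
hits = round(0.68 * dissemblers)       # dissemblers scoring 21 or above
misses = dissemblers - hits            # dissemblers called authentic
false_positives = 0                    # 100% of controls correctly called
correct_rejections = controls - false_positives

phi = phi_coefficient(hits, misses, false_positives, correct_rejections)
print(round(phi, 3))
```

With these reconstructed frequencies the coefficient agrees with the reported value to three decimal places.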

The empirical scale of Cofer (Cof), which 
was developed 10 years ago on a quite dif- 
ferent college sample, is able to correctly 
identify 65% of the dissemblers and almost 
all of the authentic records at a cutting score 
of 20 or above. Cofer et al. (1949) reported 
correct identification of 86% of his dis- 
semblers and 96% of his authentic records 
(phi = .641). This was accomplished by re- 
scoring the same records from which the scale 
was developed. The shrinkage in percentage 
of correct identification in the present study 
would be expected on cross-validation. De- 
creasing the cutting score in the present sam- 
ple would, of course, increase the proportion 
of correct identification of dissemblers at the 
expense of increased false positives—which 
could be well afforded. Gough’s (1950) ear- 
lier observation that Cofer’s scale “gives 
promise of being a very interesting addition 
to the pool of MMPI keys” (p. 409) is well 
borne out by the present results. 


Table 2 

Screening Efficiency of Dissimulation Measures 

                 Simulated Records    Authentic Records 
                 (N = 250)            (N = 190) 
Cutting          Prop. Called         Prop. Called 
Score            Authen.              Simul. 

21 or above      .26                  .02 
20 or above      .35                  .04 
6 or above       .38                  .09 
18 or above      .47                  .09 
31 or above      .48                  .14 
32 or above      .37 
36 or above      .54                  .15 
22 or above      .71                  .11 


In accord with expectations, the L scale is 
indicative of dissimulation. The extent to which 
identification of simulated protocols could be 
improved is limited by the low cutoff in this 
15-item scale. The rational social desirability 
scales of Hanley (Ex) and Wiggins and Rum- 
rill (Sd-R, Sd-A) appear to identify dissimu- 
lated records at slightly better than chance 
level while minimizing the proportions of false 
positives rather successfully. Lower cutting 
scores might be of some practical value in 
these scales. 

The rational scale of Edwards (En) which 
is probably the most widely used of current 
measures of “social desirability” misclassifies 
more dissemblers than it correctly identifies. 
Considering that the dissemblers in the pres- 
ent study were instructed to respond in terms 
of “social desirability” as defined by Edwards 
(1957), the En scale is clearly not what it has 
been assumed to be. 

The present findings, with respect to K, 
add more negative evidence concerning the 
meaning of K in populations other than the 
original (Hunt et al., 1948; Schmidt, 1948; 
Tyler & Michaelis, 1953; Welsh & Dahl- 
strom, 1956). It is clear that high K, in a 
college sample, is not prima facie evidence of 
social desirability faking. It should be noted, 
however, that low K is a fairly good indica- 
tion of the absence of this response tendency. 
Neither of these statements contradicts cur- 
rent clinical practices in regard to the inter- 
pretation of K (Welsh & Dahlstrom, 1956). 


Response Bias 


The need for differentiating the different 
“stylistic” components that enter into inven- 
tory response measures has been emphasized 
by Jackson and Messick (1958) and by Web- 
ster (1958). In comparing measures of social 
desirability response style, it seems appro- 
priate to estimate the extent to which other 
response styles may be operative in account- 
ing for the same variance. One such response 
style is that of “acquiescence” which has been 
studied in other contexts (Jackson & Messick, 
1958). Fricke’s “response bias” scale (B) 
may be considered a measure of acquiescence 
since it consists of 63 items of high “am- 
biguity” that have been arbitrarily scored 
“true.” The failure of B to discriminate dis- 
semblers from nondissemblers (Table 2) sug- 
gests that it represents a stylistic component 
mainly irrelevant to the task of identifying 
social desirability dissemblers. There are, un- 
doubtedly, factors other than “response bias” 
which contribute to the variance of the B 
scale, but it would seem to serve as the best 
available operational definition of response 
bias tendencies. For this reason, the eight 
measures of dissimulation were correlated 
with the “response bias” scale under both 
standard and role-playing conditions. Table 3 
presents these correlations. 


Table 3 

Correlations of Dissimulation Measures with the Response Bias Scale Under Standard and Social Desirability Instructions* 

Columns: Men, Standard (N = 105); Men, Social Desirability (N = 144); Women, Standard (N = 85); Women, Social Desirability (N = 106). 

* An r of .257 is significant at the .01 level. 


Dissimulation scales such as Sd, Sd-R, and 
Cofer’s scale (Cof) are relatively free from 
this bias, while K and the scales of Edwards 
(En) and Hanley (Ex) are more subject to 
its effects. The response bias properties of K 
have been noted by several authors (see Jack- 
son & Messick, 1958), and the possibility of 
response bias in En and Ex was noted in an 
earlier paper by the present author (Wiggins 
& Rumrill, 1959). 

In addition to the individual differences in 
susceptibility to response bias among the dis- 
simulation measures, it is interesting to note 
the different degrees of this set that appear to 
operate under the two conditions of adminis- 
tration. In general, it appears that dissimula- 
tion instructions act to increase the operation 
of the response bias factor. This is in line 
with Hanley’s suggestion that acquiescent in- 
dividuals might have a tendency to be defen- 
sive as well (Hanley, 1957). 


Intercorrelations Among Dissimulation Meas- 
ures 


Table 4 shows the intercorrelations among 
dissimulation measures for the 105 men under 
standard instructions (lower left) and the 144 
men under experimental instructions (upper 
right). Dissimulation scales which are slightly 
or insignificantly related under standard con- 
ditions become highly interrelated under in- 
structions to answer in terms of social desir- 
ability. It might be hypothesized that a dis- 
simulation set increases both response bias 
(denial) and social desirability tendencies. 
Since all of the scales appear to measure both 
of these tendencies in varying proportions, 
the effect of dissimulation instructions might 
be to increase the intercorrelations among 
measures. 

It thus becomes difficult to choose between 
the scales under dissimulation instructions. 
Some choice is possible under standard in- 
structions, however. Here the fallacy of de- 
veloping a rational measure based on the cor- 
relation with a presumed measure of social 
desirability, such as K, becomes apparent. In 
general, the higher the correlation with K, the 
less empirically effective is the measure of so- 
cial desirability. This could be due to the fact 
that K contributes response bias and other 
variance irrelevant to the identification of so- 
cial desirability dissemblers (Hanley, 1957). 
The clear exception to this trend is found
in Sd-R, which is not an effective empirical
measure of social desirability but which is so
free of response bias (Table 3) as to be virtually
unrelated to K under standard instructions.


Table 4


Intercorrelations Among Dissimulation Measures for
105 Control Men (lower left) and 144
Experimental Men (upper right)*


Sd of v Sd-R Sd-A En A


720 596 590 665
775 756 755 711 783
798 582 601 572 745
646 702 714 848
321 629 633 606
601 X 928 818


725 ' 803 778
696 659 770


* ... s significant at the .01 level

The scale which is most similar to Sd, un- 
der standard instructions, is the scale of Cofer 
(Cof). This similarity is not surprising con- 
sidering the manner in which the scales were 
developed. Cofer’s scale is based on items 
which were changed under fake-good instruc- 
tions, with the additional property that they 
were also left unchanged under fake-bad in- 
structions. The development of Sd was, in 
part, a cross-validation of Cofer’s procedure 
and, as might be expected, 14 of Cofer’s 
items appear in the Sd scale. The two scales 
appear to be equally free of response bias 
(Table 3) and almost equivalent in their 
ability to screen social desirability dissemblers 
(Table 2). It appears that Sd and Cof are 
the best available experimental scales for in- 
vestigating social desirability response tend- 
encies in the MMPI. 

Correlations between Hanley’s scale (Ex) 
and En, K, L, and Cof are similar to those 
reported by Hanley (1957). It should be 
noted that, on the basis of these correlations, 
Hanley felt that Cof was of little value compared
with K, En, and Ex (p. 394). Edwards’
scale (En) is seen to be highly similar
to Sd-A, as was suggested in an earlier
paper (Wiggins & Rumrill, 1959). It is also
substantially related to K, against which it
was partially “validated.”


Social Desirability in the MMPI 


Studies by Fordyce (1956), Hanley (1957), 
and Edwards (1957) have done much toward 
creating the impression that the MMPI clini- 
cal scales are heavily saturated with social 
desirability response style. Such “social
desirability” is a matter of definition, however,
and these studies have rested on the correlations
of En and Ex with the clinical scales as
evidence of the susceptibility of the clinical 
scales to this bias. The correlations in Table 5 
are especially relevant to this line of reason- 
ing. The columns represent the dissimulation 
measures in decreasing order of their effectiveness
in identifying known (instructed) social
desirability dissemblers. The rows represent
the MMPI scales, and the data are those
of the 105 control men who represent a standard
MMPI group comparable to those employed
by others.


Table 5
Correlations of Dissimulation Measures and MMPI Clinical Scales for
105 Control Men Under Standard Instructions*


Dissimulation Measures


Clinical
Scales      Sd     Cof      L
Hs        -019    -190    001
D         -327    -388    085
Hy        -090    -068    203
Pd        -061    -399   -102
Mf        -309    -510   -046
Pa         010    -134    052
Pt        -103    -537   -373
Sc        -091    -503   -262
Ma         170    -125   -199
Si        -334    -303   -123
A         ~(49    -523   -326
R         -299     043    366


* For N = 100, an r of .257 is significant at the .01 level.
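
The significance threshold quoted in the footnote to Table 5 can be checked approximately. The sketch below is not part of the original analysis: it assumes the Fisher z approximation to the sampling distribution of r under a true correlation of zero, and the function name `critical_r` is illustrative only.

```python
import math
from statistics import NormalDist


def critical_r(n: int, alpha: float) -> float:
    """Approximate two-tailed critical value of Pearson r for sample size n.

    Assumption: atanh(r) is roughly normal with standard error
    1 / sqrt(n - 3) when the population correlation is zero.
    """
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return math.tanh(z_crit / math.sqrt(n - 3))


if __name__ == "__main__":
    # N = 100 and alpha = .01 are the values given in the table footnote.
    print(round(critical_r(100, 0.01), 3))
```

For N = 100 and the .01 level this approximation gives a value of about .256, in close agreement with the .257 stated in the footnote; an exact computation based on the t distribution would differ slightly.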

Table 5 shows a trend for the more accu- 
rate empirical measures of social desirability 
(Sd, Cof, L) to be only slightly correlated 
with the clinical scales, and for the less ac- 
curate measures (Sd-A, En, K) to be more 
highly correlated with the clinical scales. A 
convincing case can be made for the influ- 
ence of “social desirability” in the MMPI by 
using the Edwards scale (En) as a measure 
of this response style. However, if one were 
to employ an actual empirical measure of this 
tendency, such as Sd, the case is considerably 
weakened. Several of the correlations of Sd 
and Cof with the clinical scales suggest that 
social desirability is indeed a factor to be 
reckoned with in normal populations—even 
though its influence has been exaggerated in 
previous studies. 

It is also apparent, from Tables 4 and 5, 
that scales such as En, Sd-A, and K share a 
common element that runs throughout the 
MMPI clinical scales. Table 3 strongly sug- 
gests that this element is, in part, some form 
of “response bias.” There are now a con- 
siderable number of studies (see Jackson & 
Messick, 1958) which suggest that this “re- 
sponse bias” is related to the primary factors 
measured by the MMPI and that it may 
well reflect valid (criterion-relevant) variance
rather than “error” in assessment. In any
event, it seems clear from the present study
that the various “stylistic” components of inventory
responses must be carefully separated
and that recourse to empirical procedures is
a requisite to eventual clarification of their
role in personality inventories, such as the
MMPI.


Table 5 (continued)


Sd-A


Ex Sd-R
-388
-412

140


-496
-393
-034
-486
-366
-287
-807
-753
-394
-575
-$94

275


-266
386
169
242
292
122
373
350

-()28

-463

-242

-516


Summary 

The empirical validity of several existing 
measures of social desirability in the MMPI 
was investigated by the method of contrasted 
groups. A group of 250 college men and 
women were instructed to answer the MMPI 
in terms of the social desirability of the items. 
A different group of 190 college men and 
women received standard MMPI instructions. 

Although two empirical measures of social 
desirability were relatively effective in identi- 
fying dissemblers, certain rational social de- 
sirability scales, and especially that of Ed- 
wards (SD), were found to be poor predictors 
of the criterion. The rational social desirabil- 
ity scales which were ineffective in predicting 
dissimulation were found to be substantially 
correlated with an independent measure of re- 
sponse bias. It was suggested that previous 
reports of the influence of social desirability 
on the MMPI clinical scales would be more 
appropriately considered as studies of response
bias. It was emphasized that these
two “stylistic” components should be differentiated
and that empirical methods are the
methods of choice in such an enterprise. 


Received August 28, 1958.


INTELLIGENCE TEST PERFORMANCE AND THE
DELAY FUNCTION OF THE EGO


GEORGE SPIVACK, MURRAY LEVINE, AND HERBERT SPRIGLE


Devereux Foundation, Devon, Pennsylvania 


Many studies have shown low, but signifi- 
cant, correlations between Rorschach human 
movement (M) responses and general intelli- 
gence (Levine, Spivack, & Wight, 1959). The 
M response has been conceptualized as both 
a product and a measure of the delay func- 
tion of the ego, or of inhibition ability (Singer, 
1955; Singer, Wilensky, & McCraven, 1956). 
Since intelligence test performance may be 
conceptualized in terms of ego function 
(Fromm, Hartman, & Marschak, 1954; Rapa- 
port, Gill, & Schafer, 1945), the question 
arises whether it is possible to relate more 
specific measures of delay, not only to gen- 
eral intelligence but, also, to details of intelli- 
gence test performance itself. 

Our conception is similar to the position 
detailed by Rapaport (1951) in which the 
“apparatuses” of cognition develop in child- 
hood as a function of inevitable delay in 
gratification, and in which the further development
of thinking (memory, fantasy, abstraction,
problem solving, and planfulness) supports
increased delay or inhibition of the
expression of impulses in consideration for re- 
ality demands. With increased age, thinking 
becomes increasingly a substitute for direct, 
impulsive action and can serve as a partial 
discharge of tensions. 

Since the many aspects of thinking are in 
a sense equivalent to what we test when we
test intelligence, it should be possible to dem- 
onstrate a relationship between general intel- 
ligence and measures of ability to delay which 
bear no obvious similarity to the kinds of 
tasks employed in intelligence tests. More- 
over, if this conception is valid, it should be 
possible to isolate instances in intelligence 
test performance wherein the ability to inhibit 
or delay plays an integral part in success or 
failure. 


Method 
Experimental Tasks 


Time estimation. This task required that 
the subject (S) tell E when a 15”, 30”, and 
60” time interval had elapsed. Two estimates 
at each time were obtained in random order. 
The scores used were the sums of both esti- 
mates for each of the three time intervals. 
Higher scores are considered indication of 
better delay functioning. 

This task was selected since a prior study 
with a similar population had shown the time 
sense to be related to the ability to delay 
impulse gratification by responding to dis- 
tant incentives (Levine & Spivack, in press). 
Singer and Opler (1956) and Singer, Wilen- 
sky, and McCraven (1956), working with 
adult schizophrenics, have shown that accu- 
racy of performance on this time estimation 
task is related to other measures of inhibition 
ability. 

Stroop color-word test. This test, as adapted 
for the present study, required S: (a) to read 
aloud as quickly as possible four color names 
(red, green, blue, and yellow) which are 
printed on a white card in black ink a total 
of 50 times (B/W card); (b) to name as
quickly as possible the color of the ink in 
which 50 color names were printed where the 
color name and the ink hue are incongruent 
(C/W card). Faster reading times are in- 
dicative of greater ability to control or in- 
hibit a strong interfering habit in performing 
this task. 

Previous investigators (Klein & Salomon, 
1952; Lazarus, Baker, Broverman, & Mayer, 
1957) have accepted performance on this task 
as a measure of an ability to regulate competing
responses or to resist the effects of interfering
response tendencies. This task was
employed in the present study due to its conceptual
similarity to a previously employed
measure of “cognitive inhibition” (Meltzoff & 
Levine, 1954; Levine & Meltzoff, 1956). 

Rorschach M tendency. The series of 26 
Barron M-threshold inkblots (Barron, 1955) 
was employed, with instructions for one re- 
sponse per card. The score used was the total 
number of M responses produced. 

The M tendency was employed because it 
has been shown to be related both theoreti- 
cally and experimentally to ego delay func- 
tioning, and has been the central operational 
measure in much of the research in this area 
(Singer, 1955; Singer, Wilensky, & McCraven, 
1956). 

General intelligence. General intelligence 
was measured by the Wechsler-Bellevue in- 
telligence scales. These measures were ob- 
tained from the clinical records of the insti- 
tution and on the average the IQ tests were 
administered 15 months prior to the other 
experimental procedures. 


Subjects 

The population consisted of 123 emotion- 
ally disturbed adolescents in residential treat- 
ment. There were 86 males and 37 females in 
the group. All diagnoses were represented, except
for the psychoses and chronic brain syndrome.
The average age was 16 years, with a
range of 12 to 19. The mean IQ was 101, 
with a range of 70 to 140. The distribution 
of IQs was essentially normal, and the stand- 
ard deviation was 14.6. 


Procedure 


Each S was seen individually and was pre- 
sented with the battery of tests as “an at- 
tempt to develop some new tests.” The Ss 
were assured that the present test results were 
purely of an experimental nature and would 
not be entered into their clinical records. The 
Ss were generally cooperative, and the tasks 
seemed readily understandable. 


Results 


The intercorrelations of the experimental 
measures are presented in Table 1. The sig- 
nificant correlations between all three meas- 
ures and IQ support the hypothesis that meas- 
ures of ego delay are related to general intel- 
ligence. 

In order to obtain a clearer picture of the 
meaningfulness of these correlations in terms 
of IQ points, the distributions of each of the 
three delay measures were split at the median, 
and mean IQs for the upper and lower half of 
each were compared. Only the distribution of 
the 60” time estimations was employed, primarily
because it correlated best with the
other variables. The t tests for IQ differences
in the three measures considered individually 
are all significant, as would be expected from 
the significant correlations. Based on any 
single measure, the good and poor “inhibitors” 
differ on the average from five to eight IQ 
points. When we selected those Ss who were 
good and poor on all three delay measures, 
we found a mean difference between groups 
of 17.2 IQ points. The difference is significant,
by t test, well below the usually accepted
probability levels.
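
The median-split comparison described above can be sketched as follows. The example scores are invented for illustration, the placement of ties at the median is an assumption, and the pooled-variance t statistic is only one common form of the t test; the report does not state which form was used.

```python
import math
from statistics import mean, median, variance


def median_split(scores, iqs):
    """Split subjects at the median delay score; return (low, high) IQ lists.

    Assumption: subjects scoring exactly at the median go to the low half.
    """
    cut = median(scores)
    low = [iq for s, iq in zip(scores, iqs) if s <= cut]
    high = [iq for s, iq in zip(scores, iqs) if s > cut]
    return low, high


def pooled_t(a, b):
    """Student's t for two independent groups, using the pooled variance."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))


if __name__ == "__main__":
    # Hypothetical delay scores and IQs for six subjects.
    delay = [10, 20, 30, 40, 50, 60]
    iq = [90, 95, 100, 105, 110, 115]
    low, high = median_split(delay, iq)
    print(mean(high) - mean(low), pooled_t(high, low))
```

Selecting subjects who fall in the same half on all three delay measures, as was done for the 17.2-point comparison, amounts to intersecting three such splits before the group means are compared.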


Table 1


Intercorrelations* Among IQ and the Measures of Ego Delay


Variable

1. Time Estimation 15”
2. Time Estimation 30”
3. Time Estimation 60”
4. Stroop C/W(b) reading
5. Barron M total
6. Wechsler IQ


* For an N of 123, correlations of .19 are significant at the .05 level (two-tail test).
(b) All correlations involving C/W are partial rs in which reading time for the black ... control for general reading speed.


Since the tasks were taken as different
measures of the delay function of the ego,
the interrelationship among them becomes
important. It can be noted in Table 1 that 
the intercorrelations are significant, with the 
exception of those between time estimations 
and M. 
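
Table 1 reports the correlations involving the C/W card as partial rs that control for general reading speed. A minimal sketch of a first-order partial correlation is given below; the function names and the example values are illustrative and do not come from the study.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_r(r_xy, r_xz, r_yz):
    """Correlation of x and y with z held constant (first-order partial r)."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


if __name__ == "__main__":
    # Hypothetical values: x = C/W time, y = another measure, z = B/W time.
    print(partial_r(r_xy=0.6, r_xz=0.5, r_yz=0.5))
```

Here z stands for reading time on the black-and-white card, so that the remaining association between C/W time and another measure is not attributable to reading speed alone.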

In order to relate the three ego delay meas- 
ures to intelligence test performance itself, 
we examined the relationship between our 
three measures and scores on the digit span 
subtest of the Wechsler. This particular subtest
was chosen on the basis of Wechsler’s discussion
of it as a test of concentration and
ability to attend (Wechsler, 1944, p. 84), and 
our own evaluation of this task as requiring 
in part the ability to control or inhibit com- 
peting or extraneous thoughts and stimuli. 
Digit span total raw scores were related to 
each of the three measures of ego delay, 
controlling for level of general intelligence 
through the use of the roving median chi- 
square test (Cronbach, 1949). The relation- 
ship between digit span and the C/W task 
approached significance (χ² = 3.5; p = .06)
as did the relationship with time estimation
(χ² = 3.0; p = .08). The relationship between
digit span and M was not significant.
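
The reported probabilities can be reproduced from the chi-square values if each test is assumed to have one degree of freedom, as in a 2 x 2 median test. The report does not state the degrees of freedom, so that is an assumption of this sketch, and the function name is illustrative.

```python
import math


def chi2_p_df1(chi2: float) -> float:
    """Upper-tail probability of a chi-square value with 1 degree of freedom.

    With 1 df, chi-square is the square of a standard normal deviate, so
    P(X > chi2) = P(|Z| > sqrt(chi2)) = erfc(sqrt(chi2 / 2)).
    """
    return math.erfc(math.sqrt(chi2 / 2.0))


if __name__ == "__main__":
    for value in (3.5, 3.0):
        print(value, round(chi2_p_df1(value), 2))
```

Under this assumption the two values round to .06 and .08, matching the probabilities given in the text.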


Discussion 


These findings provide further support for 
the hypothesis that measures of an ego delay 
function are related to general intelligence. 
While the correlations between IQ and the 
specific measures tend to be low, when we 
select individuals who are good and poor 
inhibitors consistently, the mean IQ difference 
(17.2 points) is rather substantial. It would 
appear that as we develop more reliable and 
more precisely defined measures of delay 
functioning, it should be possible to increase 
the correlation substantially. The low corre- 
lations among the different measures, and the 
fact that time estimation and M do not cor- 
relate, suggest that there may very well be 
more than one important style of control 
(Singer, Wilensky, & McCraven, 1956). The 
fact that M does not correlate with time esti- 
mation is consistent with other recent find- 
ings which suggest that the M response in 
childhood and adolescence does not have the
same interpretive value as in adults (Levine,
Spivack, & Wight, 1959; Litwin, 1957). 
One reservation should be kept in mind in 
that the present population was an emotion- 
ally disturbed one. Impulsivity is character- 
istic of many of the adolescents we used, and 
the possibility exists that in a normal group 
the relationship under investigation might 
differ either in degree or kind. It is also pos- 
sible that the same study in a normal group 
would require more sensitive measures, since 
impulsivity might manifest itself more subtly 
and less pervasively in normals. 

The finding with digit span indicates that 
in adolescents not only are delay mechanisms 
related to general level of intelligence, but 
that the specific operation of such mechanisms 
may be identified in the details of intelligence 
test performance itself. This finding, along 
with those of prior studies with adults (Levine,
Glass, & Meltzoff, 1957; Levine, Spivack,
& Wight, 1959), indicates that it may
eventually be feasible to discuss intelligent 
“behavior” within the general framework of 
ego psychology. The task remains to explore 
further various relationships between meas- 
ures of control and qualitative aspects of in- 
telligence test performance. An area worth ex- 
ploring is that of the nonintellective aspects of 
intelligence test performance. An individual 
with poor delaying ability may not perse- 
vere when the difficulty of the task arouses 
anxiety or it becomes apparent that a quick 
solution is not forthcoming. He might char- 
acteristically respond with the first thought 
that comes to mind, whereas holding him- 
self in check and reflecting a moment might 
permit him to develop a response which 
would score two points instead of one. The 
possibility also exists that in individuals with 
good delay functioning, a positive value is 
placed on thinking and “intellect,” with thinking
functioning for such individuals as both a direct
and indirect gratification. For such individuals,
manipulating abstract concepts, developing
vocabulary, and the idea of finding
the answer may be gratifying. Such qualities 
are certainly assets in taking an intelligence 
test. A more detailed analysis of the attitudes 
and approach of the subject throughout the 
test might reveal more clearly the relationships
between personality and intelligence.
Once such relationships have been clearly
shown, it may be possible to develop a more 
unified theory in which intelligence and per- 
sonality variables may be discussed in the 
same terms. 


Summary 


The purpose of the present study was to 
demonstrate a relationship between three 
measures of ego delay function, general intel- 
lectual level, and performance on the digit 
span subtest of the Wechsler-Bellevue (1944). 
The results support the hypothesis that ego 
delay function, as measured by M, time esti- 
mation, and the Stroop test, is correlated with 
IQ. Performance on the digit span subtest was 
shown to be related to time estimation and to 
the Stroop (1935) test independently of gen- 
eral intellectual level. The results suggest the 
feasibility of treating intelligence test per- 
formance and personality variables within a 
single theoretical framework. 


Received August 28, 1958.


Since symptoms which resemble mental deficiency
often result from the schizophrenic
process, it is sometimes difficult to distinguish 
patients who are schizophrenic but not men- 
tally defective from those who are both. The 
resultant errors may be deleterious to the non- 
defective schizophrenic, in that they inhibit 
appropriate therapy. 

The present paper reports a study of the 
characteristics of a number of cases in which, 
after a psychiatric interview, a diagnosis of 
mental deficiency was erroneously made in 
addition to the finding of schizophrenia. 

An inspection of the records at a neuropsy- 
chiatric hospital revealed that 41 out of ap- 
proximately 2,400 patients bore a diagnosis 
of both mental deficiency and psychosis. 

An attempt was made to test each of the 
41 with the Wechsler-Bellevue Intelligence 
Scale, Form I. Twenty-nine were testable, 
and 17 made IQ scores of 70 or above, de- 
spite the fact that there were clear indications 
in most of their protocols of psychotic disrup- 
tion of test performance. This group of 17 pa- 
tients was chosen for further study. 

The mean IQ score for the 17 patients was 
85.9. The range was 71 to 103. Everyday 
clinical experience and considerable research 
indicate that chronic schizophrenia reduces 
functioning IQ. While it is difficult to obtain 
an exact estimate of the amount of reduction, 
reports of previous findings (Gilliland, Witt- 
man, & Goldman, 1943; Hunt & Cofer, 1944) 
indicate that an estimate of 15 points is not 
unreasonable. It would appear, therefore, that
despite their diagnosis, the 17 patients were,
as a group, in the average range of intellectual 
endowment. The mean age of these patients 
was 56.9 years. Eight of them were over 59 
years of age, the maximum age for which 
Wechsler provides IQ tables. For these eight 
patients, the weighted score-IQ equivalents 
for ages 55-59 were used. This procedure is 
conservative, for purposes of this study, since 
it slightly underestimates the IQs. 

The patients had been diagnosed on the 
basis of a psychiatric interview soon after ad- 
mission. The 17 diagnoses had been made by 
nine different psychiatrists. 

The main purpose of this investigation was 
to determine how these 17 schizophrenic pa- 
tients, who were misdiagnosed as mental de- 
fective as well as psychotic, differed from 
other patients who were diagnosed only as 
psychotic. 

In order to make the comparison, a 26-pa- 
tient control group, chosen from the same 
wards, was selected so that they had the same 
mean year of birth (1896) and mean year of 
admission (1940) as the experimental group. 
For each of these measures the variances were 
almost identical for the two groups. 

The experimental and control groups were 
compared with respect to educational and oc- 
cupational level, two social history variables 
which were thought to be of possible impor- 
tance. The control group was found to have a 
slightly higher mean educational level (3.2 
years) than the experimental group (6.7 
years), but the difference was not statisti- 
cally significant (p = .36). 

The highest occupational level attained by 
each S was scored, using the median Army
General Classification Test score for World
War II inductees as reported by Stewart 
(1947). The purpose of this comparison was 
to determine whether a patient was likely to 
be diagnosed as mentally defective because 
his occupation was one usually held by peo- 
ple of low intelligence. The mean A.G.C.T. 
score corresponding to the occupations was 
95.6 for the experimental group and 105.9 
for the controls. The difference was significant 
at the .01 level. 

The two groups were next compared on 
diagnosis aside from the finding of mental de- 
ficiency. Although both groups consisted en- 
tirely of patients described as functional psy- 
chotics, they were found to differ markedly 
in the number of patients who were labeled 
in terms of the classic subtypes of schizo- 
phrenia. Of the experimental group, less than 
one-third were diagnosed either paranoid, 
hebephrenic, catatonic, or simple, while all of 
the control patients received such a diagnosis. 
Over two-thirds of the experimental patients 
received a diagnosis of “psychosis with men- 
tal deficiency,” or some similar label. In none 
of the cases was there mentioned a variety of 
psychosis other than schizophrenia. This ob- 
servation raised the question of whether the 
experimental patients may have received their 
erroneous diagnosis in part because they 
lacked those distinctive schizophrenic symp- 
toms which make it possible to diagnose the 
patient in terms of classic subtypes. 

It therefore seemed desirable to compare 
the two groups of patients on their sympto- 
matology at the time of intake. The psychia- 
trist’s description of the patient in the intake 
interview was copied from each patient’s 
folder. These descriptions were found to be 
of about equal length for the two groups, 
showing a mean of 33 lines for the experi- 
mental group and 32 lines for the control 
group. Using these descriptions, three psy- 
chologists rated each patient on the Multi- 
dimensional Scale for Rating Psychiatric Pa- 
tients (Lorr, 1953), using only those 42 items 
which were designed for use by an interview- 
ing psychiatrist. 

The three sets of ratings were compared 
for interrater reliability by means of an intra- 
class correlation coefficient (Haggard, 1958). 
It was found that the ratings were reliable
above chance (p < .05) on all except two
items (Nos. 14 and 26). These two items 
were eliminated from further computations. 
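
The reliability check can be illustrated with a one-way intraclass correlation computed for a single item across patients. The report cites Haggard (1958) without giving the formula, so the one-way random-effects form used here is an assumption, as are the function name and the data layout.

```python
from statistics import mean


def icc_oneway(ratings):
    """One-way random-effects intraclass correlation for one rated item.

    `ratings` holds one row per patient; each row contains the same
    number k of ratings, one from each rater.
    """
    n = len(ratings)
    k = len(ratings[0])
    grand = mean(x for row in ratings for x in row)
    row_means = [mean(row) for row in ratings]
    # Mean square between patients and mean square within patients.
    ms_between = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(ratings, row_means) for x in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


if __name__ == "__main__":
    # Hypothetical ratings: three patients, three raters in full agreement.
    print(icc_oneway([[1, 1, 1], [2, 2, 2], [3, 3, 3]]))
```

A coefficient near 1 indicates that the raters order the patients alike on that item, while values near or below zero indicate agreement no better than chance.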
The Multidimensional scale allows the rater 
to mark an item as undecided. Whenever two 
of the three raters marked an item “unde- 
cided” for a given patient the item was 
dropped for that patient, but if only one 
judge marked it undecided, the ratings of the 
other two judges were used, and a mean rat- 
ing was obtained for each item on each S. 
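
The rule for combining the three raters' judgments on one item can be written out as a short sketch. Representing an "undecided" mark as `None` is a convention adopted here for illustration, and the function name is hypothetical.

```python
from typing import Optional, Sequence


def item_rating(ratings: Sequence[Optional[float]]) -> Optional[float]:
    """Combine the raters' scores for one item on one patient.

    None stands for "undecided". If two or more raters are undecided the
    item is dropped (None is returned); otherwise the mean of the
    remaining ratings is used.
    """
    decided = [r for r in ratings if r is not None]
    if len(ratings) - len(decided) >= 2:
        return None
    return sum(decided) / len(decided)


if __name__ == "__main__":
    print(item_rating([2, 4, None]))     # one rater undecided
    print(item_rating([2, None, None]))  # item dropped for this patient
```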
The items of the Multidimensional scale are 
grouped by the test authors into 12 symptom 
clusters or “factors,” 8 of which are com- 
posed primarily of items from the psychiatric 
rating portion of the scale. The two groups 
of patients were compared on their mean pathology
rating on each of these eight clusters.
The two clusters labeled by the test authors 
as conceptual disorganization and perceptual 
distortion were found to discriminate the two 
groups of patients, and both were in the pre- 
dicted direction. That is, the experimental 
group was rated as significantly less patho- 
logical than the control group, the differ- 
ence being significant at the .05 level, com- 
puted by a two-tailed Mann-Whitney test. The 
symptom clusters on which the two groups 
did not differ significantly were those labeled 
paranoid projection, melancholy agitation, 
motor disturbances, hysteric conversion, self- 
depreciation vs. grandiose expansiveness, and 
retarded depression vs. manic excitement. 
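
The group comparison on each symptom cluster rests on the Mann-Whitney U statistic, which can be sketched as below. The data in the example are invented, tied values receive average ranks, and the conversion of U to a probability (by tables or a normal approximation) is omitted.

```python
def average_ranks(values):
    """Rank values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(a, b):
    """Return (U_a, U_b) for two independent samples."""
    ranks = average_ranks(list(a) + list(b))
    rank_sum_a = sum(ranks[: len(a)])
    u_a = rank_sum_a - len(a) * (len(a) + 1) / 2.0
    u_b = len(a) * len(b) - u_a
    return u_a, u_b


if __name__ == "__main__":
    # Hypothetical cluster ratings for experimental and control patients.
    print(mann_whitney_u([1.0, 1.5, 2.0], [2.5, 3.0, 3.5]))
```

The smaller of the two U values is the one ordinarily referred to tables for a two-tailed test.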
The two symptom clusters which distin- 
guished the groups might be thought to be 
those which interviewers would consider most 
clearly indicative of psychosis. To verify this 
impression, the eight cluster labels, together 
with the designations of their several respec- 
tive item dimensions, were submitted to 13 
psychiatrists to rank order on the extent to 
which the symptom clusters are important for 
distinguishing psychosis from nonpsychosis. 
The two predicted symptom clusters were 
rated as the most important. The mean rank- 
ings for perceptual distortion and conceptual 
disorganization were 1.8 and 2.5, respectively,
while the mean rankings of the other six factors
were, in the order listed above, 4.4, 6.1,
3.6, 7.5, 5.7, and 4.5.

It appears, then, that those patients bearing
the additional diagnosis of mental deficiency
showed fewer symptoms that are
usually thought of as identifiably psychotic. 
Although the interviewing psychiatrists saw 
sufficient signs of psychosis to diagnose the 
patients as such, they evidently responded to 
the paucity of distinctively psychotic symp- 
toms by adding the diagnosis of mental de- 
ficiency. 


Summary 


Seventeen schizophrenic patients who, on 
the basis of an interview, were misdiagnosed 
as mentally defective in addition to their diag- 
nosis as psychotic were compared with 26 
controls who were diagnosed only as schizo- 
phrenic. The experimental group was found 
to have attained a lower occupational level 
than the controls, and was reported to have
exhibited less distinctively psychotic symptoms
in diagnostic interviews.


The major focus of research in psychotherapy
has been on the evaluation of its
effects. Comparatively little has been done to 
study the qualities of psychotherapists which 
may contribute to the success of psychother- 
apy. Most theoreticians and therapists agree 
that “the psychology of the psychologist . . . 
enters into the determination of the thera- 
peutic product” (Rogers, 1949; Shoben, 1949, 
p. 367). Research has shown that people 
rated as better therapists are seen by their 
peers, colleagues, and supervisors to be hap- 
pier, less anxious, better adjusted, and more 
likeable persons than are those rated as poorer 
therapists (Bandura, 1956; Kelley & Fiske, 
1951; Luborsky, 1952). 

Almost all schools of psychotherapy accept 
the thesis that good psychotherapy involves 
a good interpersonal relationship in which the 
patient or client can feel accepted, and in 
which the therapist must be an accepting 
person. Research findings have tended to sup- 
port the hypothesis that better, more expert 
therapists are more accepting than are poorer 
therapists. This conclusion has been deduced 
from indirect evidence (Fiedler, 1950a; Fied- 
ler, 1950b; Strupp, 1957), or from colleagues’, 
peers’, or supervisors’ evaluations of training 
therapists (Kelley & Fiske, 1951; Luborsky, 
1952). No investigations, however, have directly
tested the widely held proposition that
better therapists are more accepting persons. 

Since theory has suggested (Fromm-Reich- 
man, 1949; Rogers, 1951) and research has 
demonstrated (Berger, 1952; Sheerer, 1949; 
Stock, 1949) that a person’s ability to accept 
others is positively related to his feelings of 
self-acceptance, these attitudes might well be 
investigated together. 

The purpose of the present study was to 
make a direct test of the hypothesis that bet- 
ter psychotherapists are more accepting of 
others and more self-accepting than are poorer 
psychotherapists. The measures of the psy- 
chotherapists’ attitudes are in terms of ex- 
pressed acceptance of others and expressed 
self-acceptance. 


Hypotheses 


It was hypothesized that therapists who 
are more accepting of the self and others are 
better therapists, as rated by their super- 
visors. They also rate their psychotherapeutic 
ability more accurately, that is, in better 
agreement with their supervisors’ ratings. It 
was also hypothesized that, since the more 
accepting therapists rate themselves more ac- 
curately, the relationship between the self- 
ratings and the degree of acceptance is greater 
for the more accepting therapists than for the 
less accepting therapists. 

Corollary hypotheses were that the super- 
visors and the subjects, respectively, are in- 
ternally consistent. For both the supervisors 
and the subjects, ratings of case movement 
and of psychotherapeutic ability are positively
related. A final corollary hypothesis
was that expressed acceptance of self is positively
related to expressed acceptance of
others.


Subjects


Two groups of Ss were used in the experi- 
ment. The Ss were third- and fourth-year 
doctoral candidates in clinical, counseling, 
and school psychology, most of whom had 
completed a year’s internship. 

Group 1 consisted of 28 graduate students 
who were enrolled in a practicum in psycho- 
logical counseling at Teachers College, Co- 
lumbia University, during the academic year 
1956-1957. Group 2 consisted of 51 graduate 
students or graduates who had been in the 
practicum during the academic years 1953- 
1954, 1954-1955, 1955-1956, and who re- 
sponded to the request for data. Of 76 for- 
mer students from whom the data was re- 
quested, 51 or 67% responded. 

The practicum runs for an entire academic 
year. Students see clients in a clinic, integral 
with the department, under intensive super- 
vision of professional members of the staff, in 
supervisory groups of three or four. 


Procedure 


The procedure involved the use of scales of 
expressed acceptance of others and expressed
self-acceptance which were constructed by
Berger (1952) and rating scales of psycho- 
therapeutic ability and of case movement de- 
vised specifically for the present experiment. 


Measures Obtained 


Six measures were obtained from the Ss and 
the supervisors. 

The two measures obtained from the su- 
pervisors were their ratings of the psycho- 
therapeutic ability of the Ss in their past and 
current supervisory groups, and their ratings 
of the therapeutic movement of each client 
seen in counseling by each S in their current 
supervisory groups, Group 1. A total of 91 
clients was seen by all of the Ss. The Ss saw 
from two to five clients each. 

Three measures were obtained from all of 
the Ss; namely, ratings of their own psycho- 
therapeutic ability and expressed acceptance 
of others (AO) and expressed self-acceptance 
(SA), as measured by Berger’s (1952) scales. 
In addition, the Group 1 Ss rated the thera- 
peutic movement of each of their clients. 
Ratings of the movement of the counseling
cases were obtained for Group 1 Ss only, since 
their clients had been most recently seen and 
were, therefore, more likely to be rated accu- 
rately by both the Ss and the supervisors. 


Rating Scale of Psychotherapeutic Ability 


The rating scale of over-all psychothera- 
peutic ability was as follows: 


6—One of the best                  07%
5—Considerably above average       18%
4—Somewhat above average           25%
3—Somewhat below average           25%
2—Considerably below average       18%
1—One of the poorest               07%

The supervisors were asked to rate the Ss 
in the six categories, using as a reference 
group all students they had known at a simi- 
lar level of training. Further, they were asked 
to select three individual students whom they 
considered “best,” “average,” and “poorest” 
in psychotherapeutic ability, and to think 
of these three students as specific reference 
points in making the ratings. 

The Ss were asked to rate themselves in the 
above categories, using as a reference group 
all the students they had known at a similar 
level of training. 


Rating Scale for the Movement of Cases 


The rating scale for the movement of each 
case contained two steps, as shown: 


Step I                 Step II
3—Above average        6—Great progress
                       5—Considerable progress
                       4—Above average
                       3—Below average
1—Below average        2—Very little progress
                       1—No or almost no progress

The supervisors and Ss were individually 
asked to rate the movements of the cases first 
in the categories under Step I, and then with 
the scores under Step II. They were asked to 
use as a reference group the movement of all 
clients or patients in counseling or psycho- 
therapy with similarly trained students they 
have known. 
Berger’s Scales


Data on the development, reliability, and 
validity of Berger’s scale for expressed ac- 
ceptance of others and expressed self-accept- 
ance were published by him (1952). Since 
the scales reflect expressed feelings of accept- 
ance, a basic and realistic acceptance of self 
and others may not necessarily be revealed. 
Berger expressed the belief that “Immature 
and unrealistic persons may score very high 
on SA & AO.” 8

In using the Berger scales in the present 
study, it was assumed that most graduate 
students and graduates of a clinical, counsel- 
ing, or school psychology training program 
would be sufficiently mature, realistic, and 
secure so that a basic and realistic accept- 
ance of the self and others may be inferred 
from the scores they achieve on the SA and 
AO scales. 

Since Berger’s scales were validated under 
conditions of anonymity of Ss, a procedure 
was devised, in the administration of the 
scales for the present study, to maintain 
similar conditions. The responses and ratings 
of the Ss were kept unidentifiable to the su- 
pervisors and the experimenter. The Ss were 
so informed in advance. 

To determine the variability and reliability 
of the Berger scales with the population used 
in the present study, a pilot study was con- 
ducted in which the scales were administered 
to the Group 1 Ss and the results compared 
with those obtained by Berger (1952). A 
comparison of the pilot study group with 
Berger’s seven groups indicated an adequate 
spread of scores on both Berger scales. Split- 
half reliabilities of .79 and .99 were found 
for the AO and SA scales, respectively, indi- 
cating adequate reliability for use in the pres- 
ent study. 


Results and Conclusions 


All the hypotheses were evaluated with 
Pearson product-moment correlations, and 
tested for significance with the t test, except
where otherwise indicated. In each case the 
probability was estimated on the basis of a 
two-tailed test. The mean, standard deviation,
and range for each scale are presented
in Table 1.

8 E. M. Berger. Personal communication. February

Table 1
Means, Standard Deviations, and Ranges of
Scales Used in the Present Study

Scale                                                             Mean    SD    Range
Expressed acceptance of others
Expressed self-acceptance
Ratings of psychotherapeutic ability by supervisors
Ratings of psychotherapeutic ability by Ss
Ratings of psychotherapeutic movement of clients by supervisors
Ratings of psychotherapeutic movement of clients by Ss


Main Hypotheses


The hypothesis that therapists who are 
more accepting of the self and others are bet- 
ter therapists, as rated by their supervisors, 
was tested by correlating supervisors’ ratings 
of Ss’ psychotherapeutic ability with Ss’
scores on Berger’s SA and AO scales. Nonsignificant
correlations of -.02 and .09, respectively,
were obtained, which failed to
support the hypothesis. Correlations of su-
pervisors’ ratings of Ss’ psychotherapeutic
ability with Ss’ scores on Berger’s scales were
also obtained for the Group 1 and Group 2
Ss separately. None of the correlations was
significant.

The hypothesis that the more accepting
therapists rate their psychotherapeutic ability
more accurately, that is, in better agreement
with their supervisors’ ratings, was tested by
correlating Ss’ scores on the SA and AO scales
with the differences between the Ss’ self-rat-
ings and the supervisors’ ratings of them. A
significant negative correlation would confirm
the hypothesis. Two measures of the differ-
ences between the supervisors’ ratings of Ss
and Ss’ self-ratings were obtained. One meas-
ure was the differences without signs, the ab-
solute differences; the other measure was the
algebraic differences. To avoid the use of
negative numbers, a constant, 10, was added 
to each algebraic difference. 
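
The two difference measures described above can be sketched as follows. This is only an illustration: the ratings and AO scores are hypothetical, the function names are invented here, and the direction of the algebraic difference (self-rating minus supervisor's rating) is an assumption, since the text does not state it.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def difference_measures(self_ratings, supervisor_ratings, constant=10):
    """Return (absolute, algebraic) differences between paired ratings.

    Assumption: difference = self-rating minus supervisor's rating.
    The constant (10 in the text) only shifts the algebraic differences
    so that none of them is negative.
    """
    raw = [s - v for s, v in zip(self_ratings, supervisor_ratings)]
    absolute = [abs(d) for d in raw]
    algebraic = [d + constant for d in raw]
    return absolute, algebraic


# Hypothetical ratings on the six-point scale and hypothetical AO scores.
self_ratings = [4, 5, 3, 6, 2, 4]
supervisor_ratings = [3, 5, 4, 4, 2, 5]
ao_scores = [110, 95, 120, 101, 88, 130]

absolute, algebraic = difference_measures(self_ratings, supervisor_ratings)
r_absolute = pearson_r(ao_scores, absolute)
r_algebraic = pearson_r(ao_scores, algebraic)
```

Adding the constant leaves the correlation unchanged, because a Pearson correlation is unaffected by shifting either variable by a fixed amount.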

None of the correlations between the scores 
on the acceptance scales and the differences, 
both absolute and algebraic, between super- 
visors’ and self-ratings were significantly dif- 
ferent from zero. Nonsignificant correlations 
also resulted for Ss who scored in the upper 
and lower 50% on the AO scale and on the 
SA scale. As the Ss’ scores on Berger’s scales 
are independent of the differences between 
Ss’ self-ratings and supervisors’ ratings of Ss’
psychotherapeutic ability, the hypothesis was 
not supported. 

The hypothesis that the relationship be- 
tween the self-ratings and the degree of ac- 
ceptance is greater for the more accepting 
therapists than for the less accepting thera- 
pists was tested by correlating the Ss’ scores 
on Berger’s scales with Ss’ ratings of their 
own psychotherapeutic ability for Ss scoring 
above and Ss scoring below the fiftieth percentile
on each of the Berger scales. The significance
of the differences between the correlations
of the upper and the lower 50%
on each of the Berger scales was tested by 
the transformation of the correlations into z 
scores. 
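
A test of this kind is commonly carried out with Fisher's r-to-z transformation. The sketch below assumes that this is the transformation meant and that the two halves are independent groups; the function name and the example values are hypothetical and are not taken from the study.

```python
from math import atanh, erfc, sqrt


def fisher_z_difference(r_high, n_high, r_low, n_low):
    """Compare two independent correlations with Fisher's transformation.

    Each r is transformed with z = atanh(r); the difference of the two
    transformed values is divided by its standard error,
    sqrt(1/(n_high - 3) + 1/(n_low - 3)).
    Returns the test statistic and its two-tailed probability.
    """
    z_high = atanh(r_high)
    z_low = atanh(r_low)
    standard_error = sqrt(1.0 / (n_high - 3) + 1.0 / (n_low - 3))
    statistic = (z_high - z_low) / standard_error
    # Two-tailed probability under the standard normal distribution.
    probability = erfc(abs(statistic) / sqrt(2.0))
    return statistic, probability


# Hypothetical correlations and group sizes for the upper and lower halves.
statistic, probability = fisher_z_difference(0.30, 40, -0.20, 39)
```

As the results below illustrate, the difference between two correlations can reach significance even when neither correlation differs significantly from zero.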


Table 2

Correlations of Subjects’ Scores on Berger’s Scales
with Self-Ratings by Subjects

                                   Correlations         Differences Between
Berger’s Scales                    with Self-Ratings    Correlations of High and Low 50%
Expressed acceptance of others
  High 50%
  Low 50%
  All Ss
Expressed self-acceptance
  High 50%
  Low 50%
  All Ss

* Significant at .05 level.

Table 3

Correlations of Ratings of Case Movements by Supervisors
and Subjects, with Ratings of Subjects
(N = 28) by Supervisors and Subjects

                                   Ratings of Ss
Ratings of Case Movements          By Supervisors    By Ss
By supervisors
  Highest rating per S
  Lowest rating per S
  Mean rating per S
By subjects
  Highest rating per S
  Lowest rating per S
  Mean rating per S

** Significant at .01 level.
* Significant at .05 level.

As shown in Table 2, the correlation of Ss’ 
self-ratings with Ss’ scores on Berger’s AO 
scale is positive for Ss scoring above the fif- 
tieth percentile on the AO scale, and nega- 
tive for the Ss scoring below the fiftieth per- 
centile. Though the difference between the 
two correlations is significant at the .05 level, 
neither correlation, by itself, is significantly 
different from zero. No significant difference 
was found between the correlations of Ss’ 
self-ratings and their scores on Berger’s SA 
scale for Ss scoring above the fiftieth per- 
centile and Ss scoring at and below the fif- 
tieth percentile on the SA scale. While there 
is a small but significant relationship of Ss’ 
self-ratings and their scores on the SA scale 
for all Ss, there is no such relationship for 
Ss scoring at and below the fiftieth percentile, 
or for Ss scoring above the fiftieth percentile 
on the SA scale. The hypothesis was partially 
supported. 


Corollary Hypotheses 


The corollary hypotheses, that for both the 
supervisors and the Ss, the ratings of case 
movement are positively related to ratings of 
therapeutic ability, were tested by correlating 
the highest, lowest, and mean ratings of the 
therapeutic movement of the clients seen by 
the Group 1 Ss with the respective super- 
visors’ and Ss’ ratings of the Ss’ psychothera- 
peutic ability. As indicated in Table 3, with 
the sole exception of the lowest ratings of
case movement by the Ss correlated with the 
Ss’ self-ratings, the hypotheses were supported 
by correlations significant at the .01 and .05 
levels. 

The final corollary hypothesis, stating that 
expressed acceptance of self is positively re- 
lated to expressed acceptance of others, was 
tested and supported by a correlation of .53 
between Ss’ scores on the two Berger scales, 
which is significant at the .01 level. 


Additional Findings 


As shown in Table 3, except for the posi- 
tive correlation, significant at the .05 level, of 
the mean rating of case movement by Ss and 
supervisors’ ratings of Ss’ psychotherapeutic 
ability, ratings of case movements by super- 
visors are independent of Ss’ self-ratings of 
psychotherapeutic ability, and Ss’ ratings of 
case movements are independent of supervi- 
sors’ ratings of Ss’ psychotherapeutic ability. 

There is a positive correlation of .56, sig- 
nificant at the .01 level, of supervisors’ and 
Ss’ ratings of case movement. 

Table 4 shows that there is a positive rela- 
tionship, significant at the .01 level, between 
supervisors’ ratings of Ss’ psychotherapeutic 
ability and Ss’ self-ratings. The positive relationship
exists for Ss scoring below the fiftieth
percentile on the AO scale and for Ss
scoring below the fiftieth percentile on the SA
scale, but not for the Ss scoring above the
fiftieth percentile on either of the scales. The
difference between the correlations of low and
high Ss on both scales, as tested by the transformation
of the correlations into z scores,
was not significantly different from zero.


Table 4

Correlations of Supervisors’ Ratings of Subjects
with Self-Ratings by Subjects

                                             Correlations with     Differences Between
Subjects Rated by Supervisors                Self-Ratings by Ss    Correlations of High and Low 50%
High 50% of expressed acceptance of others
Low 50% of expressed acceptance of others
High 50% of expressed self-acceptance
Low 50% of expressed self-acceptance
All Ss

** Significant at .01 level.

Further analysis of the data revealed that 
of the 39 Ss who scored at and below the fif- 
tieth percentile on the AO scale, 28 Ss also 
scored at and below the fiftieth percentile on 
the SA scale. The correlations of the super- 
visors’ ratings of Ss’ psychotherapeutic abil- 
ity with Ss’ self-ratings were .50 for the 28 
Ss who scored at and below the fiftieth per- 
centile on both scales, .13 for the 11 Ss who 
scored low on only the AO scale, and .74 for 
the 11 Ss who scored low on only the SA scale. 
The correlation for the Ss who scored low on 
only the AO scale is not significantly different 
from zero. The other correlations are signifi- 
cant at the .01 level. 


Discussion 


The confirmation of the corollary hypothe- 
ses indicated at least an internal consistency 
by the supervisors and by the students, and 
some measure of concurrent validity of the 
instruments used in the study. 

The therapists’ self-ratings of their psycho- 
therapeutic ability are most closely related to 
their highest ratings of the psychotherapeutic 
movement of their clients. They are also re- 
lated to a lesser extent to their mean ratings 
of case movement, and are not at all related 
to their lowest ratings of client psychotherapeutic
movement. The findings strongly suggest
that the therapists tended to perceive
their own psychotherapeutic ability more in 
terms of their psychotherapeutic successes 
than in terms of their comparative failures. 


Expressed Acceptance of Self and Others, and 
Psychotherapeutic Ability 


The results of the present study failed to 
support the hypothesis that better psycho- 
therapists are more accepting of others or 
more self-accepting than are poorer psycho- 
therapists. They indicate that, among selected 
graduate students in clinical, counseling, and 
school psychology training programs, there is
no relationship between expressed acceptance
of self or others and judged psychotherapeutic 
ability. 

The one positive finding was that the cor- 
relation of therapists’ self-ratings of their 
therapeutic ability with their expressed ac- 
ceptance of others was positive for the thera- 
pists expressing more acceptance of others, 
and negative for the therapists having less 
expressed acceptance of others, the difference 
being significant at the .05 level. The finding 
may be due to chance since neither of the 
correlations was, in itself, significantly dif- 
ferent from zero. 

An additional finding suggesting further ex- 
amination is that therapists having compara- 
tively low expressed acceptance of self and 
others and those having only comparatively 
low expressed self-acceptance perceived their 
psychotherapeutic ability more as their su- 
pervisors evaluated them than did the thera- 
pists with greater expressed acceptance of self 
and others and even those with only lower 
expressed acceptance of others. The findings 
suggest that the greater degree of similarity 
may be related to low self-acceptance rather 
than to high self-acceptance as postulated.
Perhaps the less self-accepting therapists are
less secure and do not form as independent
judgments as do the more self-accepting
therapists. They might seek evaluation from 
their supervisors more frequently and be more 
sensitive to their supervisors’ opinions than 
the more self-accepting therapists. 

Another possible explanation for the find- 
ings that the low scorers on AO and SA rate 
their therapeutic ability more as their super- 
visors rate them than do the high scorers is 
that Berger’s idea that “Immature and un- 
realistic persons may score very high on SA 
and AO” is correct and applicable to the Ss 
in the present study contrary to the assump- 
tion made in the study. If so, the high scorers 
on the “acceptance” scales would be less re- 
alistic and therefore less “accurate” in their 
self-ratings. Similarly, Berger’s idea may ac- 
count for the positive correlation of self-rat- 
ings and AO scores for the high AO scorers, 
and the negative correlation for the low AO 
scorers. Unrealistically high self-ratings by the 
high scorers might account for the positive 
correlation. To the extent that the high 
scorers on Berger’s scales are unrealistic, the
conclusions based on the present study would 
be limited. 


Implications 


The findings suggest that, contrary to the 
common opinions among psychotherapists and 
theoreticians, better therapists are not neces- 
sarily more accepting people. If it is true that 
acceptance of the patient or client is neces- 
sary to good therapy, it may be that better 
psychotherapists are capable of being accept- 
ing of their patients or clients during the psy- 
chotherapeutic hours. They would not neces- 
sarily have to be accepting of all people, or 
people in general, all of the time. Fiedler and 
Senior (1952) suggest the possibility that 
therapists are able “to ‘soft pedal’ and/or 
disguise those aspects of their personalities 
which would be distressing to the patients, 
and emphasize, instead, those aspects of their 
personalities which are reassuring and inspir- 
ing” (p. 452). Rogers (1957) wrote that “It 
is not necessary . . . that the therapist be a 
paragon who exhibits this degree of integra- 
tion, of wholeness, in every aspect of his life. 
It is sufficient that he is accurately himself in 
this hour of this relationship . . .” (p. 97). 
Thus it may be necessary, in the investiga- 
tion of the contribution of the therapist’s 
personality to good psychotherapy, to study 
the therapist’s personality as it is manifested 
in the therapeutic situation. Notwithstanding 
this possible modification, this study does not 
support the hypothesis that general accept- 
ance of self or others is related to psycho- 
therapeutic ability. 


Summary 


A direct test was made of the hypothesis 
that better psychotherapists are more accept- 
ing of others, and more self-accepting, than 
are poorer psychotherapists. This belief is 
held by most psychotherapists and schools of 
psychotherapy. 

The 79 subjects were current and former 
students in the doctoral programs in clinical, 
counseling, and school psychology at Teach- 
ers College, Columbia University, who were 
taking, or had taken in the previous three 
years, an intensive required practicum in psy- 
chological counseling. 
The criterion of psychotherapeutic ability
was the ratings of the subjects’ supervisors 
in the practicum. Each supervisor supervised 
three or four students per semester. Thera- 
peutic movement of each subject’s clients was 
also rated. The acceptance variables were 
measured by Berger’s scales of Expressed 
Self-Acceptance and Expressed Acceptance of 
Others. 

The results failed to support the hypothe- 
sis. No relationship was found between psy- 
chotherapeutic ability, as rated by supervi- 
sors, and expressed acceptance of self or 
expressed acceptance of others. Ratings of 
therapeutic ability were found to be related 
to ratings of therapeutic movement of clients, 
and expressed acceptance of self was found to 
be related to expressed acceptance of others. 

Though there are limitations of the study 
in terms of the scales and subjects used, the 
results suggest that better therapists are not 
necessarily more accepting persons. If accept- 
ance is a necessary ingredient in the thera- 
peutic process, it may be that better thera- 
pists are more accepting in the specific situa- 
tions of therapy. 


Received September 8, 1958. 
DISCRIMINATION OF FEMALE SCHIZOPHRENICS
WITH CONFIGURAL ANALYSIS OF THE 
MMPI PROFILE 


WILLIAM J. EICHMAN 1


Veterans Administration Hospital, Roanoke, Virginia 


Although the MMPI was originally de- 
signed to discriminate between various diag- 
nostic categories, early research reported dis- 
couraging results in this respect. These early 
studies were largely confined to t tests between
groups on individual scales. Recently,
more complex approaches have been used, 
among which is a configurational analysis of 
paired scores within a particular profile. Sul- 
livan and Welsh (1952), who developed the 
methodology, demonstrated some success with 
this method in discriminating ulcer patients 
from controls. Taulbee and Sisson (1957), 
using the same methodology, were able to 
discriminate schizophrenics from neurotics in 
a sample of male VA patients. In both stud- 
ies, cross-validation showed good reliability 
for the signs. 

The Taulbee-Sisson study with schizophren- 
ics appears to be more important in regard to 
practical application of the results. Neverthe- 
less, the use of their signs upon a sample of 
female VA patients yielded discouraging re- 
sults and led to the present study. It is not 
surprising to find different results in a dif- 
ferent hospital setting, particularly when pa- 
tients of the opposite sex are used as subjects. 
The question remains whether these results 
are due to different diagnostic practices or to 
real differences in performance between the 
sexes.

1 The author wishes to thank Hayden L. Mees,
VA Hospital, American Lake, Washington, Ralph
Simon, VA Hospital, Perry Point, Maryland, and
John Altrocchi, Duke University, for providing protocols
for cross-validation.

In addition to the above, the Taulbee-Sis-
son study has limited utility. Their control
group was composed of neurotic patients,
making it necessary to screen out, by other
means, all character and personality disorders,
organics, manic-depressives, etc. This presents 
an awesome problem, particularly in regard to 
the character and personality disorders, since 
these groups show symptomatology which ap- 
pears to be between that of the neurotic and 
schizophrenic groups, both qualitatively and 
quantitatively. The present study combines 
these nosological categories in the control 
group. Thus the problem is to develop a set 
of signs from the MMPI that will distinguish 
between two groups of female patients: (a) 
an unselected heterogeneous group of women 
diagnosed as neurosis, character or behavior
disorder and (b) an unselected, heterogeneous
group of women who had a primary diagnosis
of schizophrenia. Considerable refinement of
Sullivan-Welsh methodology can be made.
Those refinements which were included in the
present study are: (a) the use of the validat-
ing scales in the development of signs; (b)
the use of numerical difference scores rather
than the simple “+” or “-” used in the two
previous investigations; (c) cross-validation
of signs in other hospital settings in addition
to the one in which the signs were developed;
and (d) application of weights to the signs
which will maximize the discrimination be- 
tween groups. 

A secondary problem which was investi- 
gated concerned the use of the K scale as a 
suppressor variable. Since the introduction of 
this scale, it has been almost routine clinical 
practice to add a prescribed proportion of the 
K score into the scores of five of the clinical 
scales before converting raw scores into stand- 
ard scores. Although discrimination between 
diagnostic groups might be improved by such 
a practice, important statistical disadvantages
are introduced, i.e., experimental independence
is reduced. The authors (McKinley,
Hathaway, & Meehl, 1948) of the K scale 
introduced these weights with the caution 
that other weights might be more practical in 
other situations. This problem has never been 
fully investigated. The approach used in this 
study was to develop signs from both the K 
weighted and the unweighted profiles, utiliz- 
ing that set of signs which offered more po- 
tential. 


Method 


Subjects 


The Ss used for the validation of signs in- 
cluded all female veterans in the Veterans 
Administration Hospital, Roanoke, Virginia, 
who had been administered the MMPI and 
who had received diagnoses fitting the cri- 
teria described below. Approximately half 
these Ss had been administered the test on 
admission as part of a battery, and the test 
results had some influence on the diagnosis. 
The remaining half of the Ss were adminis- 
tered the test for different purposes, e.g., pre- 
psychotherapy evaluation, evaluation of cur- 
rent status, etc., and the test had no influence 
on diagnosis, e.g., the test may have been ad- 
ministered a year after diagnosis had been as- 
signed. This latter group, however, included 
many chronic patients whose symptomatic 
picture had changed and who would not have 
received the same diagnosis were they to be 
re-evaluated. 


Control Group 


Thirty-three patients were found who were 
diagnosed neurosis, personality or character 
disorders. The percentage of patients in each 
group was: (a) neurosis 62%, (b) character
or personality disorders 38%. The average 
age of the group was 37.0, with a standard 
deviation of 10.1 years. 


Schizophrenic Group 


Fifty-six patients were found who had a 
diagnosis of schizophrenia. These broke down 
according to the following categories: (a) 
chronic, undifferentiated, 52%; (b) paranoid,
35%; (c) catatonic, 4%; (d) schizo-affec- 
tive, 4%; (e) hebephrenic, 4%; and (f) 
simple, 2%. 
The average age of the group was 33.6, with
a standard deviation of 6.6 years. Many of 
these patients showed little or no overt psy- 
chotic symptomatology. Initially, this study 
had been planned to use two schizophrenic 
groups, one with overt psychotic symptoma- 
tology and one with subtle psychotic symp- 
tomatology. The judgment of “subtle” or 
“overt” was made on the basis of whether 
one or more of the following existed at the 
time of the test administration: (a) delusions, 
(b) hallucinations, (c) gross confusion of
thought processes, (d) gross loss of control 
over behavior, (e) profound withdrawal, (f) 
other bizarre behavior. Of the 56 patients 
who comprised the total schizophrenic group, 
28 were assigned to each group. Thus half 
the validation group showed no immediate 
signs of schizophrenic psychosis at the time 
of administration of the MMPI. Results are 
reported for each of these groups separately. 


Sign Analysis 


Seventy-eight scores were obtained for each 
patient. Twelve of these were the T scores on
the 12 scales used in the study. The remain-
ing 66 were the difference between T scores
for pairs of scales, using all combinations of
the 12 scales, e.g., L - K, L - F, L - Hs,
. . . L - Ma, K - F, K - Hs, . . . K - Ma,
. . . Sc - Ma. This procedure was followed
for both the K weighted and the unweighted 
profiles. All scores which reached the .05 level 
of significance with the use of the F ratio 
were retained as signs for the scale. 
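
The count of 78 scores follows from 12 single scales plus the 66 unordered pairs that can be formed from them. A minimal sketch of this enumeration is given below. The scale order follows the example in the text; the T scores and the function name are hypothetical, and taking each difference as first scale minus second scale is an assumption.

```python
from itertools import combinations

# The 12 scales named in the text, in the order suggested by its example.
SCALES = ["L", "K", "F", "Hs", "D", "Hy", "Pd", "Mf", "Pa", "Pt", "Sc", "Ma"]


def profile_scores(t_scores):
    """Return the 12 single-scale T scores plus the 66 pairwise differences.

    Difference keys have the form "A-B" and hold t_scores[A] - t_scores[B]
    (the direction of subtraction is an assumption of this sketch).
    """
    scores = {scale: t_scores[scale] for scale in SCALES}
    for first, second in combinations(SCALES, 2):
        scores[f"{first}-{second}"] = t_scores[first] - t_scores[second]
    return scores


# Hypothetical T-score profile for one patient.
example_profile = {
    "L": 50, "K": 56, "F": 54, "Hs": 62, "D": 70, "Hy": 61,
    "Pd": 64, "Mf": 50, "Pa": 65, "Pt": 68, "Sc": 72, "Ma": 55,
}
scores = profile_scores(example_profile)
```

Each of the 78 resulting scores would then be tested separately for its ability to separate the two diagnostic groups.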

After the significant signs were determined, 
a cutting score for schizophrenia was arbi- 
trarily set at the bottom of the class interval 
in which the mean score for schizophrenics 
fell. (To simplify the computations, scores 
were grouped in intervals of five T score
points.) Thus a plus score would be assigned 
to somewhat more than 50% of the schizo- 
phrenic group on each of the significant signs 
and to a lesser proportion of the control 
group. 
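
The cutting-score rule can be sketched as follows. The sketch assumes that class intervals start at multiples of five and that a "plus" is scored when a patient's value equals or exceeds the cutting score; neither detail is stated in the text, and the data and function names are hypothetical.

```python
import math


def cutting_score(schizophrenic_values, interval_width=5):
    """Bottom of the class interval containing the schizophrenic mean.

    Assumption: intervals begin at multiples of interval_width.
    """
    mean = sum(schizophrenic_values) / len(schizophrenic_values)
    return math.floor(mean / interval_width) * interval_width


def proportion_plus(values, cut):
    """Proportion of patients whose value equals or exceeds the cut."""
    return sum(1 for value in values if value >= cut) / len(values)


# Hypothetical values of one sign (e.g., a T-score difference) per patient.
schizophrenic_values = [12, 18, 9, 21, 16, 14]
control_values = [5, 11, 16, 8, 3, 10]

cut = cutting_score(schizophrenic_values)
plus_schizophrenic = proportion_plus(schizophrenic_values, cut)
plus_control = proportion_plus(control_values, cut)
```

Because the cut lies at or below the schizophrenic mean, it will usually assign a plus to more than half of that group when the distribution is roughly symmetrical, which is the property the text describes.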


Results

Analysis of the K weighted profiles resulted
in 23 difference scores and 4 single scale
scores which significantly differentiated the
schizophrenic from the control group. Identical
analysis of the unweighted profile resulted
in a slightly greater number of significant
signs but yielded less satisfactory discrimina-
tion between groups, both in the validation
and cross-validation samples. Consequently,
further data and discussion are limited to the
MMPI profile with the K weights applied.
These mean scores are presented in Table 1.

Table 1
MMPI Scale Scores for Schizophrenic and Control Subjects (K Corrected)

Group      K      F      D
Schiz.     56.4   54.2   62.3
Control    53.5   60.9   69.8

The significant signs with the appropriate 
cutting score for the scoring in the schizo- 
phrenic direction are presented in Table 2. 

Scoring of each profile led to the following 
results: The schizophrenic group had a mean 
score of 15.8 and the control group had a 
mean score of 9.3. This yielded a highly sig- 
nificant F ratio of 18.54 for 1, 87 df. Apply- 
ing a cutting score midway between the mean 
scores resulted in 70% correct identification 
of schizophrenic Ss and 64% correct identifi- 
cation of control Ss. 


Table 2

Significant Signs for Differentiating Female Schizophrenics
from Controls Arranged in Categories
Used for Discriminant Function Analysis a

Category   Sign       Score
L          L
           D
           Hy
           Hs - L
           D - L
           Hy - L
           Pd - L
K          Hs - K
           Hy - K
           Pa - K
Mf         Mf
           Hs - Mf
           D - Mf
           Hy - Mf
           Pd - Mf
N-P        Pa - D
           Pt - D
           Pt - Hy
           Sc - Hs
           Sc - D
           Sc - Hy
           Sc - Pd
           Sc - Pt
F          Hs - F
           D - F
           Hy - F
           Pd - F

a For scoring individual profiles, add the number of significant
signs in each category to obtain five scores. See Table 3 for
appropriate weights.


Hy 60.6 66.5
Mf 50.0 45.8
Pd Pa 63.9 63.0 64.8 70.2


Discriminant Function Analysis 


Inspection of the significant signs shows 
that the majority fall into five logical cate-
gories: (a) those involving the L scale; (b)
those involving the K scale; (c) those involv- 
ing the F scale; (d) those involving the Mf 
scale; and (e) those involving the comparison 
of neurotic scales (Hs, D, Hy, Pd) with psy- 
chotic scales (Pa, Pt, Sc). The internal con- 
sistency of these groupings was high. Three 
signs do not fit into these five categories so 
easily. These are D, Hy, and Sc-Pt. These 
were assigned to the category with which 
they had the greatest correlation. D and Hy 
were placed in the L category and Sc-Pt was 
placed in the neurotic-psychotic (N-P) scales 
category. Table 2 has the signs arranged in 
these categories. 

With five categories of signs available, it 
was feasible to analyze the scores by means 
of a discriminant function (Johnson, 1949). 
The L values obtained are presented in Table
3. A rough proportional value was utilized in 
the actual computation of results, and these 
weights are also presented in Table 3. 

For the K weighted profile, the application 
of these weights resulted in a mean score for 
the schizophrenic group of 58.1 and for the 
control group of 32.1. Using a cutting score 
halfway between the means of the two groups 
(i.e., ≥ 46 = schizophrenic identification) resulted
in 68% correct identification of schizophrenic
Ss and 79% correct identification of
control Ss. This result represents a slight im- 
provement over the unweighted sign scores. 
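
The weighted scoring can be sketched as below. The category weights shown are hypothetical placeholders, chosen only to be roughly proportional to the L values of Table 3; they should not be read as the approximations actually used in the study. The cutting score of 46 is the one reported in the text, and the function names and sign counts are invented for illustration.

```python
# Hypothetical weights, roughly proportional to the L values in Table 3.
# These are NOT the approximations used in the study.
CATEGORY_WEIGHTS = {"L": 7, "K": 2, "F": 1, "Mf": 1, "N-P": 9}

# Cutting score reported in the text: totals of 46 or more are
# identified as schizophrenic.
CUTTING_SCORE = 46


def weighted_total(sign_counts):
    """Sum of (weight x number of plus signs) over the five categories."""
    return sum(CATEGORY_WEIGHTS[cat] * sign_counts[cat] for cat in CATEGORY_WEIGHTS)


def identify(sign_counts, cutting_score=CUTTING_SCORE):
    """Label a profile from its weighted total."""
    if weighted_total(sign_counts) >= cutting_score:
        return "schizophrenic"
    return "control"


# Hypothetical numbers of plus signs per category for two profiles.
profile_a = {"L": 3, "K": 1, "F": 2, "Mf": 1, "N-P": 4}
profile_b = {"L": 1, "K": 0, "F": 1, "Mf": 0, "N-P": 2}

label_a = identify(profile_a)
label_b = identify(profile_b)
```

A different cutting score can be passed to `identify` where a setting calls for catching more cases of one category at the cost of more errors in the other.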


Cross-Validation 


An additional 49 Ss were obtained at the 
same hospital. These were all admission cases 
and the diagnostic procedure coincided in 
time with the administration of the MMPI. 
Twenty-eight of these Ss had a diagnosis of 
schizophrenia, and 21 had a diagnosis of neurosis
or behavior disorder.

The scoring of individual profiles with the 
raw number of signs resulted in an over-all 
correct identification of Ss of 63%. The use 
of the discriminant function weights applied 
to the same scores raised the over-all correct 
diagnostic identification to 79%. The F ratio 
for the weighted scores is 22.53 for 1, 47 df, 
which is significant beyond the .001 level. 
The greatly improved prediction leads to the 
conclusion that the use of discriminant 
function weights facilitates the prediction of 
schizophrenic diagnosis with these signs. 
Table 4 contains a summary of statistical 
data for all groups utilized. It will be noted 
that the mean and SDs for each group are 
surprisingly similar from validation to cross- 
validation. 

MMPI data on female Ss were solicited 
from a number of other NP hospitals. At the 
time of writing, three hospitals had responded. 
The results for these hospitals are combined 
in Table 4 for easy comparison with the other 
data. The F ratio for these scores was 41.84 
for 1, 93 df, which is significant beyond the 
.001 level. Slightly reduced efficiency in dis- 
crimination is noted from the cross-validation 
in the original hospital. 


Discussion 


The schizophrenic signs obtained from a 
configural analysis of the MMPI profile show 
considerable predictive accuracy and consid- 
erable stability from sample to sample. This 
methodology appears to offer more promise 
than other approaches for predicting specific 
diagnoses. In addition, the methodology is 
easily adapted to differentiating personality 
variables other than diagnostic categories and 
easily adapted to other tests than the MMPI. 


Table 3 

Discriminant Function Weights for Predicting 
Diagnosis of Schizophrenia with Approxi- 
mations Used in the Study * 

Category of Signs     L Value     Approximations 
L                      .074       ... 
K                     -.023       ... 
F                      .014       ... 
Mf                     .012       ... 
N-P                    .092       ... 

* Multiply the approximate weight for each category in the 
third column above by the score in each category (from Table 
2) and add for a single total score. Those scores which equal 
or exceed 46 were identified as schizophrenic in this study. 
Different cutting scores may be set in other situations should 
the situation demand quick discrimination of all possible pa- 
tients of one or the other category. 


Table 4 

Discrimination Between Schizophrenic and Control 
Groups with Configural Analysis 
(Discriminant function weights; cutting score ≥ 46) 

Groups 
  Validation 
    Schizophrenic 
    Control 
    “Subtle” schizophrenic 
    “Obvious” schizophrenic 
  Cross-validation, same hospital 
    Schizophrenic 
    Control 
  Cross-validation, other hospitals 
    Schizophrenic 
    Control 

An optimally controlled predictive study 
would have two features: (a) the MMPI 
results would not be known to the physician 
when he assigned diagnosis, and (b) the 
MMPI would have been administered at ap- 
proximately the same time that diagnosis was 
made. Very few cases in this study fit both 
criteria. Insofar as the statistics are concerned, 
the two flaws would seem to counterbalance 
each other, i.e., the physician’s knowledge of 
the test results should artificially inflate the 
number of correct identifications and the in- 
congruence of time between test administra- 
tion and diagnosis should artificially lower 
the number of correct identifications. Fortu- 
nately, two of the other hospitals used in 
cross-validation supplied cases in which the 
test was not used as part of the diagnostic 
process. Unfortunately, these were cases 
which had been on the ward for some time 
and probably included schizophrenics in re- 
mission and other cases whose clinical picture 
had changed since the diagnosis was assigned. 
Thus the statistic of correct identification 
would be artificially lowered in this instance. 
The over-all correct identification for this 
group of 48 patients was 69%. Since this is 
a low estimate of predictive efficiency as a 
result of incongruence of time between testing 
and diagnosis, it would appear that the physi- 
cian’s knowledge of test results does not 
grossly inflate these statistics. 

The degree of statistical accuracy obtained 
in predicting the diagnosis of schizophrenia 
(approximately 75%) is adequate in itself. 
Intimate knowledge of the cases used makes 
the results more impressive. Those cases which 
were “misdiagnosed” seem to fall into clearly 
defined groups. The control Ss identified as 
schizophrenic fell into several categories: (a) 
flat, unresponsive, inadequate people usually 
diagnosed conversion reaction or personality 
or character disorder; (b) immature and im- 
pulsive people usually diagnosed personality 
or character disorder; and (c) marginally 
psychotic patients who did not give sufficient 
evidence for a diagnosis of schizophrenia. 
The schizophrenic Ss identified as nonschizo- 
phrenic fall into two general categories: (a) 
those who have had a schizophrenic psychosis 
but are partially or entirely recovered from it 
and (b) those who have currently active 
schizophrenic symptomatology but who also 
have many neurotic symptoms. This latter 
group follows a nonschizophrenic pattern of 
hospitalization in that they are sociable, out- 
going people who make quick recoveries and 
are discharged as quickly as the neurotic 
patients. Thus in the majority of instances, 
there appears to be good reason for cate- 
gorizing a particular patient in the wrong 
diagnostic group. Descriptively, the misiden- 
tified patient often resembles patients of the 
other category to a greater extent than he 
resembles the patients of his own group. 

It is quickly noticed that virtually all of 
the significant signs involve neurotic scales 
of the MMPI. The psychotic scales fail to 
predict schizophrenic diagnoses in this sample 
unless their relative height is contrasted with 
the neurotic scales. Second most frequent in 
number of signs are the validation scales L, 
K, F. This appears to indicate that test-taking 
attitude is important in differentiating schizo- 
phrenic disorders from the neuroses and be- 
havior disorders. It is interesting that so many 
predictive studies using the MMPI have ig- 
nored the validating scales except for the 
purpose of selecting “valid” records. It would 
seem that these validating scales are a source 
of valuable clinical information and should 
not be ignored. 

The use of discriminant function analysis 
seems to have been a valuable addition to the 
methodology. Not only has the method in- 
creased predictive efficiency, but it permits 
more accurate analysis of what signs predict 
best. From the categories of signs, we can 
gain some knowledge of the aspects of test 
performance which differentiate the schizo- 
phrenic from the control group. The group of 
signs which receive the greatest weight con- 
trasts the relative levels of neurotic and psy- 
chotic signs. Thus the female schizophrenic S 
admits to relatively more psychotic symptoma- 
tology than she does to neurotic symptomatol- 
ogy. On an absolute basis, she does not admit 
to more psychotic symptomatology than her 
neurotic counterpart. Second in order of im- 
portance are the L group of signs. It seems 
that the schizophrenic attempts naive decep- 
tion. She tries to conceal symptomatology of 
a neurotic variety from herself and from 
others. Frequently, this goes into a denial of 
illness syndrome, and the patient is usually 
described as “lacking in insight.” Third is the 
F group of signs. These contrast the neurotic 
scales with the F scale. The schizophrenic 
patient shows relatively more confusion and 
inattention as compared to the number of 
admitted neurotic symptoms. Once again there 
is no significant difference in the F scale 
alone between the groups. The fourth set of 
items contrasts the Mf scale with the neurotic 
scales. The schizophrenic patient shows more 
masculine attitudes in the absolute sense as 
well as relative to the amount of neurotic 
symptomatology expressed. The fifth group 
of signs contrasts score on the K scale, which 
measures a subtle form of defensiveness, with 
several of the neurotic scales. Taken indi- 
vidually, each of these signs shows that the 
schizophrenic female is being more defensive 
in relation to the number of neurotic symp- 
toms expressed. Nevertheless, the discrimi- 
nant function weight is negative, indicating 
that in combination with the other categories 
of signs, this subtle defensiveness is more pre- 
dictive of nonschizophrenia. The general rea- 
son for this reversal appears to be in the fact 
that L and K as individual scales are highly 
correlated and the L and K categories of signs 
are also highly correlated, both measuring 
something akin to defensiveness or saying 
good things about oneself. The situation ap- 
pears to be similar to what often occurs in 
factor analysis. After the first factor is re- 
moved, two highly correlated measures might 
well correlate negatively with each other. 
Defensiveness, as measured by the K scale, 
is believed to represent more potential for 
adjustment in the patient than is defensive- 
ness as measured by the L scale. Perhaps a 
better way of expressing it is that the L scale 
seems to measure blind denial of reality, while 
the K scale appears to measure more subtle 
and effective defenses against reality. K is a 
more complex scale than L and the only 
unique contribution the K category of signs 
has to make when in combination with the 
other signs appears to be in an opposite 
direction from when the K category is taken 
alone. This result is in accord with clinical 
observation. 


Summary 


1. Sullivan and Welsh’s approach of con- 
figural analysis was applied to the MMPI 
profiles of female schizophrenics and controls 
in an NP hospital setting. A scale of signifi- 
cant signs was developed with these validating 
groups. 

2. The significant signs were grouped into 
five a priori, logical categories and subjected 
to discriminant function analysis with sig- 
nificantly improved predictive efficiency. 

3. The weighted sign scores were then 
cross-validated in the same hospital and in 
three other hospitals with approximately 75% 
accuracy in discriminating schizophrenics 
from controls. 


ANXIETY INDICES IN SAME-SEX DRAWINGS OF 
PSYCHIATRIC PATIENTS WITH HIGH AND 
LOW MAS SCORES ¹ 


THOMAS E. HOYT ² AND MARTIN R. BARON 


Kent State University 


Since the publication of Buck’s (1948, 
1950) and Machover’s (1949) monographs, 
the House-Tree-Person (H-T-P) and Draw- 
a-Person (DAP) have become instruments in 
common use by the clinical psychologist. This 
use has been accompanied by many studies 
concerning the reliability and validity of the 
projective devices. A recent article (Swenson, 
1957) ably summarizes the evidence to date, 
most of which leaves serious doubts regard- 
ing the validity of many hypotheses con- 
cerning the meaning of drawing characteris- 
tics. Despite these findings, many writers feel 
that the technique holds promise. 

In suggesting possible uses of the H-T-P, 
Buck (1948) has pointed out some character- 
istics of the “person” drawing which are often 
noted among patients clinically diagnosed as 
anxious. Similarly, Machover (1949) has spe- 
cified in the use of the DAP those elements 
in pictures which in her clinical experience 
are indicative of anxiety. In addition, they 
both mention characteristics of drawings 
which they believe indicate personality char- 
acteristics generally thought to be associated 
with anxiety. The purpose of this study is to 
compare the frequencies of anxiety indices in 
psychiatric patients classified as anxious or 
nonanxious on Taylor’s Manifest Anxiety 
Scale (MAS) (Taylor, 1953). 


Indices Investigated 


Table 1 lists all the characteristics of 
human figure drawings which Buck and/or 
Machover believe show anxiety; Table 2 lists 
characteristics indicated by Buck and/or 
Machover to be related to personality and 
considered by the principal author to be worth 
checking as potential anxiety indicators. All 
characteristics listed in Tables 1 and 2 are 
investigated in the present study. 

¹ This study constitutes a portion of a thesis per- 
formed under the direction of the second author in 
partial fulfillment of requirements for the M.A. 
degree at Kent State University, August, 1955. 

² Formerly at State Bureau of Vocational Re- 
habilitation, Youngstown, Ohio. Now deceased. 

Evidence from other investigations regard- 
ing the validity of these indices will be con- 
sidered at a later point. 


Criterion Measure 


The objective measure of anxiety with 
which the various anxiety indicators listed in 
Tables 1 and 2 are compared in this study is 
the Manifest Anxiety scale (Taylor, 1953). 
Designed not as a clinical tool but to select 
subjects for experimental groups, the test has 
been employed to select extreme groups pre- 
sumably in regard to anxiety; performance 
measures for these groups are usually found 
to differ (Taylor, 1953). In accordance with 
the use typically made of the MAS, high and 
low anxiety groups were selected in the pres- 
ent study, and a tally then made of the 
number of anxious and nonanxious subjects 
displaying the various anxiety indices in their 
drawings. 


Table 1 

Characteristics of “Person” Drawings Investigated in 
the Present Study and Considered by Buck and/or 
Machover to be Indicators of Anxiety 

Characteristics of Drawing            Buck Reference     Machover Reference 
Placement (upper left-hand corner)    (1948, p. ...) 
Type of line (faint)                  (1948, p. ...) 
Reinforcement                         (1948, p. ...)     (1949, p. 110) 
Shading                               (1948, p. ...)     (1949, p. 98) 
Erasing                                                  (1949, pp. 98, 110) 


Table 2 

Characteristics of “Person” Drawings Investigated in the Present Study and Considered by Buck and/or 
Machover to be Indicators of Personality Attributes Other Than Anxiety 

Characteristic of Drawing          Personality Attribute 
* Type of line (faint)             Timid; Uncertain; Self-effacing 
* Erasing                          Conflict 
Size (small)                       Inadequacy 
Relative size of head to figure    Relative emphasis on Intellectual Power 
Size of head                       Social Dominance and Control of Body Impulses 
Omissions                          Helplessness; Guilt; Conflict 
Body area “out of proportion”      Lack of Insight; Shallow Emotionality; Poor Reasoning 

[Buck Reference and Machover Reference columns largely illegible; the surviving entries cite Machover (1949), one of them pp. 95-96.] 

* The characteristics type of line and erasing are also included in Table 1. 


Method 

Subjects 


Same-sex drawings of institutionalized 
women are used as the basis of this study. 
The Ss were neurotic and psychotic women 
who were new admissions to Woodside Re- 
ceiving Hospital at Youngstown, Ohio between 
January and June 1955. Tests were admin- 
istered almost weekly during this period in 
order that the Ss would be tested shortly 
after admission. The selective factors required 
that they be: (a) between 20 and 65 years 
of age; (b) able to read and write; and 
(c) in a sufficient state of awareness and sta- 
bility (as judged by the resident psycholo- 
gist) to participate as Ss. 

A total of 125 women was tested, but for 
various reasons, such as illness, incomplete 
drawings, etc., 13 were eliminated. The 112 
Ss were divided into three groups according 
to their MAS scores. The 27% with the high- 
est scores and the 27% with the lowest scores 
were used for this study. This resulted in 
60 Ss, one group of 30 with a score on the 
MAS of 1 to 12, to be called “low anxiety,” 
and one group of 30 with a score on the 
MAS of 25 to 43, to be called “high anxiety.” 
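
The selection of extreme groups described above amounts to ranking the Ss on the MAS and keeping the top and bottom 27%. A minimal sketch follows; the function name and the toy scores are assumptions for illustration, and tied scores at the group boundaries, which a real sample would present, are not handled.

```python
def extreme_groups(scores, fraction=0.27):
    """Return (low_group, high_group) from a list of (subject_id, score) pairs.

    Each group holds round(N * fraction) subjects, taken from the bottom
    and top of the ranking by score.
    """
    n_each = round(len(scores) * fraction)
    ranked = sorted(scores, key=lambda pair: pair[1])
    return ranked[:n_each], ranked[-n_each:]


# Toy data only: 112 hypothetical subjects with scores between 0 and 43.
toy_scores = [(subject, subject % 44) for subject in range(112)]
low, high = extreme_groups(toy_scores)
print(len(low), len(high))  # 27% of 112 is 30.24, i.e. 30 Ss per group
```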


Procedure 


Testing was done in a group situation, the 
number of Ss in the group being from 5 to 
15 women. The MAS was administered to 
the Ss, then the same-sex drawing test. Fol- 
lowing this, a group Rorschach was given. 
Finally, Ss were asked to draw a second figure 
of their own sex. The data herein considered 
concern only the first drawings. 

Scoring and Reliability 

An attempt reported elsewhere (Hoyt, 
1955) was made to devise a refined scoring 
manual to measure the various possible 
anxiety indices from the drawings. All draw- 
ings were first coded so that the examiner 
would not know which pictures were drawn 
by high and which by low anxiety Ss, and 
then scored on one index at a time. Score 
values usually from zero to three were as- 
signed.³ Reliability of this scoring procedure 
was determined by computing for each index 
the mean absolute difference between the 
scores assigned by the principal investigator 
and those independently assigned by a second 
psychologist for 112 Ss. Since the mean dif- 
ferences were generally small (only two were 
larger than .111), the reliability of this scor- 
ing procedure was good. For the purposes of 
the present report, all anxiety indices have 
been rescored into dichotomies such as pres- 
ence or absence. Since most instances of dis- 
agreement between the two scorers occurred 
regarding amounts of presence, utilizing two 
categories serves to increase reliability. Fol- 
lowing the rescoring on each index, the data 
were analyzed by finding the significance of 
difference between the high anxiety group and 
the low anxiety group on each index by the 
use of the chi-square test. Yates’s correction 
was employed on all tables where there was 
an expected value less than 10. 

³ The aim of Hoyt’s (1955) original study was to 
determine the existence of significant relationships 
between graduated amounts of various anxiety in- 
dices and MAS scores. To do this, he utilized a 
pilot group to determine for each index which way 
of scoring amounts of “presence” would correlate 
best with the MAS. He hoped to test the hypothe- 
sized indices and scoring procedures on a new inde- 
pendent sample; however, a new sample large enough 
to perform an adequate test of the utility of his 
scoring procedure could not be obtained. Because of 
this unfortunate characteristic of Hoyt’s (1955) de- 
sign, his scores have been dichotomized into what 
might well be considered a priori categories. 


Table 3 

Proportion of Anxious and Nonanxious Ss Displaying 
Various Characteristics in Their Drawings, with 
Chi Square Values and Levels of Confidence 

                                      Proportion Displaying 
                                         Characteristic 
Characteristic of Drawing             Anxious   Nonanxious   Chi Square   Level of Confidence 
Placement (upper left-hand corner)    .433      ...          3.88*        ... 
Type of line (faint)                  .167      ...          0.00         ... 
Reinforcement                         .700      ...          1.20         ... 
Shading                               .433      ...           .07         ... 
Erasing                               .833      ...          0.00*        ... 
Size                                  .433      ...          3.88*        ... 
Relative size of head to figure       .533      ...           .62         ... 
Size of head                          .433      ...           .94*        ... 
Omissions                             .400      ...          1.22         ... 
Body area out of proportion           .467      .53           .26         ... 

* Corrected for continuity. 


Results 


The proportion of anxious and nonanxious 
Ss displaying the various indices in their 
drawings, the chi-square values for each index 
and the levels of confidence appear in Table 
3. In view of the fact that virtually all Ss 
shaded the hair of the person in the drawings, 
shading was counted only for body shading. 
Two of the indices were found to differentiate 
significantly between high and low anxiety 
groups on the basis of the MAS criterion. 

Placement in upper left-hand corner and 
small size were found to be the most 
valid indices according to the MAS criterion 
with chi-square values significant at the .05 
level of confidence. Reinforcement and omis- 
sions were found to differentiate to some de- 
gree between high and low anxiety groups, 
but the chi-square values would be exceeded 
30% of the time by chance alone. The other 
indices were found to discriminate not at all 
between high and low MAS groups. Thus, of 
10 anxiety indices, only two are found to be 
valid at the .05 level when separately em- 
ployed utilizing the MAS as a criterion of 
anxiety. 

It is perhaps worthwhile at this point to 
re-examine Buck’s and Machover’s hypothe- 
ses. Are they saying that the presence of 
shading (as an example) is indicative of 
anxiety? If this is so, then our chi-square 
test of each index separately has been appro- 
priate. Or are they saying that anxiety may 
be indicated in a number of ways; that the 
presence of any one of many indices predicts 
anxiety? If the latter is true, then the results 
reported in Table 3 are not a fair test of 
their hypotheses. The data reported in Tables 
4 and 5 are relevant to the hypothesis that 
any S, whose drawing has any one of 
the characteristics indicated by Buck and 
Machover to be indices of anxiety, is more 
likely to be anxious than is an S whose draw- 
ing lacks any of these characteristics. 

Machover’s (1949) indices are reinforce- 
ment, shading, and erasing. None of these, it 
will be recalled, is significantly related to the 
criterion. A count of the number of high and 
low anxiety Ss, shown in Table 4, whose pic- 
tures contained any one of these indices, 
showed 87% of the anxious and 93% of the 
nonanxious Ss to have displayed one of these 
characteristics in their drawings. Because the 
proportion of cases showing at least one of 
these characteristics is so close to 1.00, chi- 
square computation is unjustified. It is clear 
from the figures, however, that the presence 
of one of Machover’s indices is not found 
more frequently in anxious than nonanxious 
Ss. 

The anxiety indices mentioned by Buck 
(1948, 1950) are placement in upper left- 
hand corner, light type of line, reinforcement, 
and shading. Of these, the first is significantly 
related to our MAS criterion. A contingency 
table (Table 5) was constructed to determine 
if the presence of any of Buck’s four indices 
was more likely to be found in drawings of 
high MAS than those of low MAS Ss. It is 
notable here, too, that a very high proportion 
of both anxious and nonanxious Ss displayed 
at least one of Buck’s indices in their draw- 
ings. Clearly, the presence of any one of his 
indices does not differentiate high and low 
MAS groups; chi square equals zero. 
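
The zero chi-square reported for Buck's indices can be reproduced with a short calculation. The sketch below is an illustration under stated assumptions: the cell counts (26 and 4 anxious, 25 and 5 nonanxious) are reconstructed from the Table 5 proportions with 30 Ss per group, and the function name is hypothetical. Yates's correction is applied as described under Scoring and Reliability, since the smallest expected frequency here is below 10.

```python
def chi_square_2x2(a, b, c, d, yates=False):
    """Chi-square for a 2 x 2 table [[a, b], [c, d]].

    With yates=True, half the total N is subtracted from |ad - bc|
    (not letting the result go below zero) before squaring.
    """
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(0.0, diff - n / 2)
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return n * diff ** 2 / denominator


# Counts assumed from Table 5: displaying / not displaying any Buck index.
anxious = (26, 4)      # .867 and .133 of 30
nonanxious = (25, 5)   # .833 and .167 of 30

corrected = chi_square_2x2(*anxious, *nonanxious, yates=True)
uncorrected = chi_square_2x2(*anxious, *nonanxious)
print(corrected, round(uncorrected, 3))
```

With these counts |ad - bc| equals N/2, so the corrected statistic is exactly zero, while the uncorrected value is small but not zero.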


Discussion 


It was apparent to the authors at the out- 
set of this study that there might very well 
be little correspondence between anxiety, 
clinically diagnosed, and manifest anxiety, as 
measured by Taylor’s MAS scale. Some might 
argue that skilled interpretations of drawings 
of the human figure would be a better cri- 
terion of anxiety than the MAS, and thus a 
correlation between MAS and drawing inter- 
pretations would indicate the validity of the 
MAS. Considering that the present data are 
simply correlational, it is noteworthy that 8 
of the 10 anxiety indices investigated in this 
study were found not significantly related to 
the MAS criterion. It is perhaps more worthy 
of note, on the other hand, that two of the 
indices, one of them specifically indicated by 
Buck as an anxiety indicator, are significantly 
related to the criterion. Since it is unlikely 
that 2 of the 10 indices could be related to 
the criterion (at the .05 level of confidence) 
by chance, the authors take the position that 
the interpretations of the drawings and MAS 
indicate more in common than might be ex- 
pected on a chance basis. 


Table 4 

Proportion of Anxious and Nonanxious Ss Displaying 
and Not Displaying Any One of Machover’s 
Anxiety Indices (Reinforcement, 
Shading, and Erasing) in 
Their Drawings 

               Displaying     Not Displaying 
Anxious        .867           .133 
Nonanxious     .933           .067 


Table 5 

Proportion of Anxious and Nonanxious Ss Displaying 
and Not Displaying Any One of Buck’s Anxiety 
Indices (Placement, Type of Line, Rein- 
forcement, and Shading) 

               Displaying     Not Displaying 
Anxious        .867           .133 
Nonanxious     .833           .167 
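
The chance argument made in the Discussion, that two of ten indices reaching the .05 level is unlikely to be accidental, can be given a rough numerical form with the binomial distribution. This is a sketch under an assumption the study does not state: that the ten tests are independent, which indices scored from the same drawings may not be. The function name is hypothetical.

```python
from math import comb


def prob_at_least(k, n, alpha):
    """Probability of k or more 'significant' results among n independent
    tests when each has probability alpha of reaching significance by chance."""
    return sum(
        comb(n, i) * alpha ** i * (1 - alpha) ** (n - i)
        for i in range(k, n + 1)
    )


# Two or more of ten indices significant at the .05 level, by chance alone.
p = prob_at_least(2, 10, 0.05)
print(round(p, 4))
```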

It is perhaps worthwhile at this time to 
compare the evidence obtained in this study 
with other findings described in the literature. 
Table 6 is presented to facilitate the com- 
parison. 

Here it can be seen at a glance that, accord- 
ing to evidence prior to this study, only two 
characteristics are related to anxiety: size and 
size of head. Of these, size was also found 
significantly related to our MAS criterion. In 
addition, placement was found before not to 
be related to anxiety, whereas in the present 
study it was related to MAS scores. Thus, the 
findings of the present study are partially in 
agreement with those reported in the litera- 
ture, and point up the relevance of placement 
where such relevance had not been previously 
confirmed. 


Using the MAS criterion, it becomes appar- 
ent, as shown in Tables 4 and 5, that the 
presence of any one of the anxiety indices 
(either of Machover or Buck) does indicate 
anxiety; however, it is also true that they 
indicate nonanxiety just as often. One possible 
interpretation of this finding is that in clini- 
cal experience one may be tempted to remem- 
ber only positive evidence and as a conse- 
quence to continue employing an instrument. 


Table 6 

Characteristics of “Person” Drawings Investigated in 
This Study as Possible Anxiety Indicators with 
the Nature of Prior Findings and Present 
Findings Indicated Separately 

Characteristics                  Prior Findings                             Present Findings 
Placement                        negative (Goodman & Kotkov, 1953)          positive 
Type of line                     negative (Royal, 1949)                     negative 
Reinforcement                    negative (Royal, 1949)                     negative 
Shading                          negative (Royal, 1949)                     negative 
Erasing                          negative (Goldworth, 1950; Royal, 1949)    negative 
Size                             positive (Lehner & Gunderson, 1953)        positive 
Size of head and relative        positive (Goldworth, 1950)                 negative 
  size of head to figure 
Omissions                        positive (Goldworth, 1950)                 negative 
                                 negative (Goldworth, 1950; Royal, 1949) 
Body areas out of proportion     positive (Goldworth, 1950)                 negative 
                                 negative (Prater, 1950; Royal, 1949) 



In the light of the present evidence, it is 
suggested that attempts be made to determine 
if the presence of any of these indices is more 
common among Ss clinically diagnosed as 
anxious than among those diagnosed as non- 
anxious. If data comparable to those in Tables 
4 and 5 are obtained using clinical criteria, 
the implication would be that virtually all 
psychiatric patients are anxious, and that the 
presence of these indices is not particularly 
worthy of note except as a possible source of 
suggestion to the therapist regarding possible 
anxiety sources. 


Summary 

The purpose of this study was to determine 
whether various indices clinically employed 
by Buck and/or Machover to diagnose anxiety 
were valid, using the Taylor Manifest Anxiety 
Scale (MAS) as a criterion. Other indices 
were also investigated. 

The indices investigated were placement in 
the upper left-hand corner of the page, faint 


type of line, reinforcement (i.e., retracing), 
shading, erasing, size, relative size of head to 
figure, size of head, omissions, and body areas 
out of proportion. 

The 112 Ss who were employed in this 
study were female admissions to a mental 
receiving hospital. These Ss were given the 
MAS and then asked to draw a figure of their 
own sex. The highest and lowest 27% of these 
112 Ss, as determined by the MAS, were 
classed respectively as anxious and nonanxious 
and the drawings of these 60 Ss independently 
scored for presence or absence of each index. 

Findings showed placement and size to be 
significantly related to MAS scores. None of 
the other characteristics significantly differ- 
entiated between the anxious and nonanxious 
patients. 

Received October 2, 1958. 


NOMOGRAPHS FOR COMPUTING THE “VALIDITY” OF 
WISC OR WECHSLER-BELLEVUE SHORT FORMS 


CECIL BRIDGES 


Department of Pupil Services, Oklahoma City Public Schools 


In the course of the selection of a short 
form of the Wechsler Intelligence Scale for 
Children (WISC) to be used as a screening 
instrument in special education, the problem 
of calculating numerous correlations of possi- 
ble short form subtest combinations with total 
score arose. McNemar (1950) has shown that 
the correlation between abbreviated forms of 
the Wechsler-Bellevue (WB) Scale and total 
score may be calculated from correlation mat- 
rices presented by Wechsler (1944) in his 
manual. This is also true of the WISC 
(Wechsler, 1949). 

There are several instances in which com- 
putation of such “validity” coefficients may 
be useful. For instance, in selecting a short 
form screening instrument from subtests, fac- 
tors other than WB short form correlation 
with total score might enter into the selection 
(ease of administration, etc.), and the “va- 
lidity” of the chosen battery might not be 
among those given by McNemar. In the case 
of the WISC no such coefficients are avail- 
able. 

In using McNemar’s formula, the calcula- 
tions require a desk calculator if reasonable 
speed is desired. The nomographs presented 
here were devised to eliminate all calculations 
except addition necessary to obtain the sum 
of the correlations of the tests and the sum 
of their intercorrelations. One nomograph is 
used for computing the short form correlation 
with total score when the short form consists 
of two, three, or four subtests. A second 
nomograph is used for computing the corre- 
lation when five, six, or seven subtests are 
used. With the use of a nomograph, calcula- 
tion is considerably faster than with a desk 
calculator. 

To use the nomograph, first find the sum of 
the r’s for each selected subtest from the 
correlation matrix presented in Wechsler’s 
manual (there will be nine of these r’s for 
each test) and add these sums.¹ Next find the 
sum of the intercorrelations between the se- 
lected subtests. 
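
The two sums just described are the quantities that enter the correlation the nomographs read off. The sketch below computes that correlation directly, using the standard expression for the correlation between a sum of k standardized subtests and the sum of all n subtests; it is assumed here, not confirmed from McNemar's article, that this is the form his formula takes. The function name and the toy matrices are hypothetical, and equal subtest variances are assumed.

```python
from math import sqrt


def short_form_validity(R, selected):
    """Correlation of a short form with the total score.

    R is the full n x n subtest intercorrelation matrix (list of lists);
    selected holds the row indices of the k subtests in the short form.
    """
    n, k = len(R), len(selected)
    # Sum of each selected subtest's r's with every other subtest, added up.
    sum_r = sum(R[i][j] for i in selected for j in range(n) if j != i)
    # Sum of intercorrelations among the selected subtests (each pair once).
    sum_inter = sum(
        R[i][j] for pos, i in enumerate(selected) for j in selected[pos + 1:]
    )
    # Sum of all intercorrelations in the full matrix (each pair once).
    sum_all = sum(R[i][j] for i in range(n) for j in range(i + 1, n))
    return (k + sum_r) / (sqrt(k + 2 * sum_inter) * sqrt(n + 2 * sum_all))


# Toy matrix: four subtests that all intercorrelate .50.
toy = [[1.0 if i == j else 0.5 for j in range(4)] for i in range(4)]
print(round(short_form_validity(toy, [0, 1]), 4))
```

The last radical depends only on the full matrix, so it is fixed for a given test; the two sums read from the scales are the only quantities that change with the choice of subtests.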

The horizontal scales on the nomograph 
correspond to the sum of the r’s (the first 
quantity mentioned above). The vertical 
scales on the nomographs correspond to the 
sum of the intercorrelations of the chosen 
tests. The curved lines in the body of the 
nomograph correspond to the correlation be- 
tween the short form and total score. Thus, 
to find the correlation of the short form with 
total score, find a point on the appropriate 
horizontal scale corresponding to the sum of 
the r’s (interpolate if necessary) and lay a 
straight edge through this point perpendicular 
to the scale. Find a point on the appropriate 
vertical scale corresponding to the sum of the 
intercorrelations, and lay a straight edge 
through this point perpendicular to the scale. 
If the intersection of the two perpendiculars 
falls on one of the contours of equal correla- 
tion, the value of r with which that line is 
labelled is the correlation of the short form 
with the total. If the point of intersection falls 
between contours of equal correlation, interpo- 
lation is necessary. The correlation found will 
usually be correct in the second decimal place 
if interpolation is done carefully. The nomo- 
graphs are strictly applicable only to the 
populations of Wechsler’s (1944) Table 41 
and Wechsler’s (1949) Table V. Because the 
subtest correlations are not perfectly reliable, 
there is some capitalization on chance when 
a battery of subtests is selected for maximum 
correlation with the total, and a slight shrink- 
age of this correlation would be expected in 
cross-validation. 

Because of a small arithmetical error in 
McNemar’s special formula (Howard, 1958), 
WB “validities” obtained with these nomo- 
graphs will be slightly higher than corre- 
sponding “validities” presented in McNemar’s 
original article. 


Fig. 1 

Fig. 2. Nomograph for computing the correlation 
of short form with total score when five to seven 
subtests are used. 


¹ There is a negligibly small constant error of about 
½ of 1% involved in the use of a single set of nomo- 
graphs for both tests. The use of a single set of 
nomographs is possible because the quantity in the 
first radical of the denominator of McNemar’s equa- 
tion is approximately the same for the WISC Table 
V (Wechsler, 1949) and WB Table 41 (Wechsler, 
1944). 

Received October 7, 1958. 

REFERENCES 

Howard, W. A note on McNemar’s “On abbreviated 
Wechsler-Bellevue scales.” J. consult. Psychol., 
1958, 22, 414. 

McNemar, Q. On abbreviated Wechsler-Bellevue 
scales. J. consult. Psychol., 1950, 14, 79-81. 

Wechsler, D. The measurement of adult intelligence. 
(3rd ed.) Baltimore: Williams & Wilkins, 1944. 

Wechsler, D. Wechsler intelligence scale for chil- 
dren: Manual. New York: Psychological Corp., 
1949. 





Journal of Consulting Psychology 
Vol. 23, No. 5, 1959 


FORCED ASSOCIATIONS, SYMBOLISM, AND 
RORSCHACH CONSTRUCTS 


JOSEPH F. RYCHLAK 


Washington State University 


An interesting and problematical aspect of 
Rorschach test interpretation is that of con- 
tent analysis. Is it legitimate to base hypothe- 
ses about personality on the content of a 
subject’s perception, or should the clinician 
limit himself to the more rigorous procedure 
of formal scoring? Piotrowski (1957, p. 324) 
notes that the latter practice is on the decline 
and that psychologists are relying increasingly 
on content analysis. 

Rorschach (1943, p. 207) suggested that 
content of inkblot interpretation might have 
a meaning of its own, but believed this to be 
determined primarily by significant relation- 
ships existing between form and content. A 
review of representative current texts on
Rorschach interpretation (Ames, Learned,
Metraux, & Walker, 1954; Beck, 1952;
Freud, 1938; Lindner, 1952; Rorschach, 
1943; Sarason, 1954; Schafer, 1954) reveals 
that none actually denies the possibility of 
content responses carrying a meaning of their 
own. Usually the author cautions against hasty 
symbolic interpretations and strongly urges 
that content is only one of several dimensions 
—and probably a lesser one—to consider in 
any given protocol. Often the context of the 
content response is stressed, with the implica- 
tion that the meaning of the latter varies with 
the former. Following such admonitions, how- 
ever, several authors go on to hypothesize a 
given meaning for a given content response 
(Allen, 1954, p. 85; Halpern, 1953, p. 37; 
Klopfer, Ainsworth, Klopfer, & Holt, 1954, 
p. 385; Phillips & Smith, 1953, p. 120). Such 
proposals may be related to a specific area of 
a card, but this is not always the case. 

Where do these interpretations come from? 
Clinical experience is usually cited as the 
major source. However, since clinicians do 
not work in theoretical vacuums, it is not sur-
prising to find that many of the symbolic
interpretations made on the basis of content 
stem from the psychoanalytical literature 
(Piotrowski, 1957, p. 324). A more basic 
question is how valid is it to generalize a 
content interpretation based upon past clinical 
cases to the present one? Do people really 
ever see the same blot or, more profoundly, 
the same content? How can one check on the 
meaning of a content response? We are prob- 
ably a long way from any conclusive method 
for the testing of content meaning. Osgood 
(1952) has presented a method which may 
eventually develop into an adequate test for 
the meaning of symbols. Some would argue 
that this will never be possible on a large 
scale and that clinical evaluation will remain 
the best bet. Although agreeing in spirit with 
the latter position, the author would like to 
describe a method for testing, in gross fashion, 
the relationship between a given construct and 
a given meaning.¹

The method employs a simple forced asso-
ciation between arbitrarily chosen Rorschach
constructs (Clouds, Fire, etc.) and a series
of six “meanings” (Love, Anger, etc.) often
assigned to these constructs (rightly or
wrongly) by clinicians. No claim is made as
to the actual meaning of a construct; all the
method aims to do is assess whether or not
people can agree in their choices between
some of the meanings often assigned by clini-
cians. It is also of interest to see, given a
choice between various meanings, which one
will be most frequently chosen.

¹ The author would like to thank the following
people for their help in arranging Ss and analyzing
the data: Julian C. Davis, Florida State Hospital;
Barron B. Scarborough and Charles Moos of Florida
State University; and John W. Neff, Clinical Direc-
tor, Pueblo, Colorado Mental Hygiene Center.



A construct is seen as a cognitive abstrac- 
tion from sensory stimulation, which may 
have more than one circumscribed pattern 
of stimulation as a referent. The construct 
“foggy,” for example, can refer to an atmos- 
pheric condition or a state of mind. Since the 
former association is more frequent, we use 
the term “symbolical” to describe the situa- 
tion when the latter association is utilized. 
Osgood (1952, p. 200) differentiates signs 
from objects, and the distinction can help in 
clarification here. An object is any pattern of 
stimulation (from a hammer to a gust of wind 
in the face) which evokes reactions on the 
part of an organism. A sign is any pattern of 
stimulation other than the object which evokes 
reactions relevant to the object. The theoreti- 
cal position taken here is that Rorschach con- 
structs are such signs. If Rocks symbolize 
Security (Halpern, 1953, p. 37), what the
clinician must mean in this suggestion is that 
a meaning is conveyed by the construct Rock. 
He—the clinician—has learned to identify 
this less frequent association which people 
have between feelings of security (object) and 
rocks (sign). The process of projection would 
be the opposite side of the coin: a person has 
need of security (object), hence presumably 
chooses from the amorphous stimulus of the 
inkblot a symbol (sign) typifying the need. 

The present study is based upon a social 
psychological rationale. It is assumed that if 
constructs chosen from the Rorschach have 
the meanings often ascribed to them by clini- 
cians, then a polling of assorted groups should 
reflect these meanings. This is so because 
cultural influences must carry the consistency 
from person to person and, except for minor 
subcultural differences, broad similarities be- 
tween groups should prevail. As a theory of 
symbolism, the point of view here espoused 
comes closer to that of Hall (1953) than to 
the more traditional psychoanalytical view, 
although Freud certainly recognized the cog- 
nitive as well as defense functions of the 
symbol. 


Method 
Instrument 


A forced association test, consisting of three 
pages and devised for group use, was admin- 
istered to the Ss. Each page dealt with the 
following 12 constructs: Boots, Smoke, Bear
(animal), Mask, Fur, Fire, Clouds, Rocks, 
Hair, Bat (animal), Island, Mountains. A 
description of each page follows: 


Page 1: Constructs were ranked at the left-hand 
side of the page and to their right were two columns 
of blank spaces. The columns of blanks were titled 
Positive Meaning and Negative Meaning, making it 
possible for an S to assign a positive or negative 
valence to each construct. 

Page 2: Constructs were arranged so that each 
was followed by three meanings: Ambition, Love, 
and Security. A check space preceded each meaning 
so that a construct could be associated with one of 
the three alternatives by simply placing a check be- 
fore the meaning “coming closest” to the construct. 

Page 3: Similar to page 2, except that each con- 
struct was followed by the meanings of Depression, 
Fear, and Anger. 
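
The three pages described above can be pictured as a small data structure. The following is only an illustrative sketch, assuming a modern Python representation; the names `CONSTRUCTS`, `ALTERNATIVES`, `Protocol`, and `tally` are hypothetical and appear nowhere in the original test materials.

```python
from collections import Counter
from dataclasses import dataclass, field

# The 12 constructs and the alternatives offered on each page, as listed in the text.
CONSTRUCTS = (
    "Boots", "Smoke", "Bear", "Mask", "Fur", "Fire",
    "Clouds", "Rocks", "Hair", "Bat", "Island", "Mountains",
)
ALTERNATIVES = {
    1: ("Positive", "Negative"),
    2: ("Ambition", "Love", "Security"),
    3: ("Depression", "Fear", "Anger"),
}


@dataclass
class Protocol:
    """One S's answers (hypothetical layout): page -> {construct: alternative}."""
    choices: dict = field(default_factory=dict)

    def check(self, page, construct, alternative):
        """Record a single forced choice, rejecting anything not offered on that page."""
        if construct not in CONSTRUCTS:
            raise ValueError(f"unknown construct: {construct}")
        if alternative not in ALTERNATIVES[page]:
            raise ValueError(f"{alternative} is not offered on page {page}")
        self.choices.setdefault(page, {})[construct] = alternative


def tally(protocols, page):
    """Count, per construct, how many Ss chose each alternative on one page."""
    counts = {construct: Counter() for construct in CONSTRUCTS}
    for protocol in protocols:
        for construct, alternative in protocol.choices.get(page, {}).items():
            counts[construct][alternative] += 1
    return counts
```

Because only one alternative is stored per construct and page, the forced-choice character of the test is preserved: a second check on the same construct simply replaces the first.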

The Rorschach constructs were selected at 
random from those frequently reported by Ss 
taking the test. Two points need to be made 
concerning the list of meanings. First of all, 
as discussed in the introduction, in another 
context they could serve as constructs. They
are arbitrarily “meanings” here only because
of their representativeness as frequently made
interpretations of the constructs. For this rea- 
son, the E defined each meaning in the experi- 
mental instructions to Ss, but did not define 
the constructs. Secondly, the meanings were 
chosen as a result of extensive pretesting dur- 
ing which it was discovered that Ss found it 
difficult to differentiate between meanings 
having similar connotations, such as Anger 
and Hatred, or Sex and Love. Of course, it 
is possible to think of Love as an Ambition, 
or reflecting Security, but it was found 
through interviewing Ss that the meanings 
eventually chosen were less likely to be con- 
fused than those of, e.g., Anxiety and Fear. 
Psychologists may have a clear conception of 
the differences between the latter, but Ss 
rarely do. 

Consequently, the final instructions were so 
phrased as to make a broad inclusion of con- 
notations under a meaning possible. Page 2 
instructions included the following brief over- 
view by the E: 

Ambition meaning of course the desire to get ahead
or improve one’s circumstances . . . . Love refers to
all kinds of warm emotions or feelings toward other
people or other things . . . and Security meaning
the feeling of safety and protection one experiences
in certain situations.

Page 3 had the following clarifying remarks:

Depression meaning of course a feeling of sadness
and gloom, a sort of “what’s the use?” notion . . . .
Fear means the feeling of fright we get when some-
thing scares us or worries us and makes us anxious
. . . and Anger is the kind of irritable feeling of
being mad or burned-up at someone or something.


No suggestions regarding the definition of 
constructs were given, except to clarify that 
two of them (Bear and Bat) referred to 
animals. 

Thirty-five Ss (20 female, 15 male) attend- 
ing an undergraduate psychology course were 
given a one-week test-retest reliability check, 
evaluated with the binomial expansion (Ed- 
wards, 1957, pp. 43-48). The mean CA of 
these Ss was 22.6, and their mean educational 
level was 14.9. In order to reach the .05 level 
of significance, an S must match his initial 
choice, on retest, at least nine times for page 
1, and at least six times for pages 2 and 3. 
On this basis, 97% of all Ss reached the .05 
level on page 1, 97% reached it on page 2, 
and 86% reached it on page 3. To demon- 
strate the upper limits of reliability, a more 
rigorous level of significance can be pointed 
to. For example, 51% of the Ss had 11 or 
more matches (.003) on page 1; 74% had 
eight or more (.007) on page 2; and 40% 
had eight or more (.007) on page 3. Thus, 
it appears that constructs and meanings can 
be associated with an acceptable degree of 
consistency. 
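
The logic of evaluating retest matches with the binomial expansion can be sketched as follows. This is a minimal illustration, not the computation actually carried out: it assumes 12 independent items per page and a chance probability of a match of 1/2 on page 1 (two alternatives). Cutoffs depend on that chance probability and on how the tail is computed, so the sketch is not claimed to reproduce the cutoffs reported above. The function names are hypothetical.

```python
from math import comb


def upper_tail(k, n, p):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def min_matches(n, p, alpha):
    """Smallest number of matches whose exact upper-tail probability is <= alpha."""
    for k in range(n + 1):
        if upper_tail(k, n, p) <= alpha:
            return k
    return None


# Page 1 under the assumed chance probability of 1/2:
# 11 or more matches out of 12 has probability 13/4096.
p_eleven_or_more = upper_tail(11, 12, 0.5)
```

Under the stated assumption, 11 or more matches in 12 items has a probability of 13/4096, or about .003, in line with the figure quoted for page 1.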


Subjects 


The following groups of Ss were tested
(N = 160):

University Students: n = 67 (30 female, 37
male); mean CA 21.4; mean education 14.2

Male Extension Students: n = 22; mean CA 27.3;
mean education 12.8

Female Extension Students: n = 25; mean CA 40.2;
mean education 14.8

Male Mental Patients (MPs): n = 23; mean CA
32.0; mean education 11.6

Female MPs: n = 23; mean CA 40.1; mean educa-
tion 11.8


The university students were drawn from 
introductory psychology classes. Male exten- 
sion students were USAF personnel, attending 
night classes at a nearby airbase. The female 
extension students included several preschool 
teachers. The MPs were in residence on the
receiving wards of a State Hospital. Although
some were bizarre, in general they were not
the severely regressed type of patient. The
only criteria of selection adhered to in choos-
ing the MPs were that they have either (a)
a high school education or (b) two years of
high school and score within the average range
of intelligence (or above) on a standard IQ
test (Wechsler or Binet). All forms of diag-
nostic categories were included, from charac-
ter disorders through neuroses and psychoses.


Procedure


The study was structured for Ss as a “sur-
vey of concept meaning.” They were told that
the E was interested in knowing what most 
people “associate—or connect—with certain 
concepts.” All testing was done in groups, but 
the MP sample involved much more intimate 
groupings of two to eight Ss. After checking 
a positive or negative valence on page 1
(presented through an example in the instruc-
tions as roughly equivalent to liking or dis-
liking the construct), Ss went on to complete
pages 2 and 3 as a group (i.e., all Ss com- 
pleted each page before another was begun). 

Only one of the three alternative meanings 
could be checked for each construct. Ss were 
told that there were no right or wrong an- 
swers, and that if none of the three meanings 
fitted exactly they were to check the one 
coming closest to what the construct sug- 
gested. Questions and complaints were occa- 
sionally raised, but on the whole, test ad- 
ministration ran smoothly. Only two of the 
original group of MPs had to be dropped 
from the sample because of a refusal to co- 
operate. 


Results 


The first question to consider is: are there
consistencies in forced associations between
constructs and meanings? Table 1 contains
the valences and meanings most frequently
associated to the constructs by the total sam-
ple, evaluated with chi square (Edwards,
1957). The percentage of all Ss (N = 160)
choosing a valence and meaning is listed.
Note that in some cases, although statistical
significance exists (.05 level or greater), no
clear-cut choice between meanings associated
to a given construct is possible.

Table 1

Percentage of Choices: Valences and Primary Associations (N = 160)

                 Valence          Page 2 Primary      Page 3 Primary
                                  Association         Association
                       % of N           % of N               % of N
Construct    Value   Choosing    Mng.  Choosing     Mng.    Choosing

Boots        Pos.      74**      Sec.    61**       Anger     36

[Entries for Smoke, Bear, Mask, Fur, Fire, Clouds, Rocks, Hair, Bat,
Island, and Mountains are not legible in the source.]

Note. N.S. means “not significant.”
* Significant between .05 and .01 levels.
** Significant at or beyond .01 level.


A reasonable second question revolves about
the influence of valences upon associations.
Did Ss viewing a construct as either positive
(i.e., roughly equivalent with “liking”) or
negative significantly alter the pattern of as-
sociations noted when this distinction was not
taken into consideration? Table 2 contains a
valence breakdown, giving the number of Ss
(n) who assigned either positive or negative
to a construct and, of these, the percentage
who selected the listed meaning in their
forced associations. Note that the meanings
most frequently associated to a construct are
remarkably similar in both tables. Moreover,
the percentage of Ss choosing a given mean-
ing remains roughly equivalent. Interesting
differences can be pointed to. For example,
the construct Bat no longer retains its asso-
ciation with Ambition when we consider the
choices of only those Ss (28) who checked
Bat as having a positive valence. A finding of
this sort suggests that, at least in some cases,
significant differences resulted because Ss were
choosing the alternative meaning least dif-
ferent from a construct, and did this with
consistency.


The next point to consider concerns any 
group differences that might be expected on 
the basis of group classification. Do MPs have 
“private meanings” differing from the other
groups which might account for some of the 
variance noted in the tables? It was decided 
that only the broad distinctions of MP-nor- 
mal and male-female would be of immediate 
significance in the present investigation. Finer 
breakdowns would not be necessary unless 
significantly different trends were noted in the 
larger comparisons. Consequently, an impulse 
to contrast each group was purposely checked. 

Comparisons of the specified subsamples 
revealed no major differences with regard to 
predominant valence or with regard to the 
most heavily favored associations. Only 12 of 
the 72 chi squares computed reached signifi- 
cance (.05 level or greater), and of these only 
three displayed a difference in predominant 
valence or primary association (i.e., meaning 
chosen most frequently by either of the groups 
under comparison). 

For example, a typical finding (.01 level)
was that although MPs (63%) and normals
(62%) both associated Security to Rocks,
normals (32%) were more likely to also
associate Ambition, whereas MPs (22%)
tended to choose Love as a secondary associa-
tion to Rocks. An example of a relative va-
lence difference (.01 level) is that women
(94%) are more likely to assign a positive 
valence to Fur than men (80%). Such find- 
ings, although certainly suggestive, cannot be 
emphasized because of the gross nature of 
the polling. 
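
Chi square enters this section in two ways: the total sample's choices are tested against chance, and subsamples are tested against each other. Both can be sketched as below. This is an illustration under stated assumptions, not the original computation. Equal expected frequencies are assumed for the test against chance. The Rocks counts are approximations back-computed from the rounded percentages above, taking 46 MPs and 114 normals from the Subjects list and filling the unreported third category as a remainder. Function names are hypothetical.

```python
from math import erfc, exp, sqrt


def chi2_upper_tail(stat, df):
    """Upper-tail probability of chi square, using closed forms for df = 1 or 2."""
    if df == 1:
        return erfc(sqrt(stat / 2.0))
    if df == 2:
        return exp(-stat / 2.0)
    raise ValueError("this sketch covers df = 1 and df = 2 only")


def goodness_of_fit(observed):
    """Chi square against equal expected frequencies (an assumption of this sketch)."""
    expected = sum(observed) / len(observed)
    stat = sum((o - expected) ** 2 / expected for o in observed)
    return stat, len(observed) - 1


def contingency(table):
    """Chi square of independence for a rows-by-columns table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat, (len(table) - 1) * (len(table[0]) - 1)


# Boots valence: 118 of 160 Ss positive (Table 2), hence 42 negative.
boots_stat, boots_df = goodness_of_fit([118, 42])

# Rocks on page 2; columns are Security, Love, Ambition (approximate counts).
rocks_stat, rocks_df = contingency([[29, 10, 7],    # MPs, n = 46
                                    [71, 7, 36]])   # normals, n = 114
rocks_p = chi2_upper_tail(rocks_stat, rocks_df)
```

With these approximate counts the Rocks comparison falls beyond the .01 level, which agrees with the level reported in the text, although the exact statistic cannot be recovered from rounded percentages.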

The three comparisons having differences 
in valence choice or primary association are 
as follows: MPs (76%) are more likely to at- 
tach a negative valence to Fire than normals 
(48%) (.01 level). MPs (52%) tend to as-
sociate Fear to Mountains most frequently,
whereas normals (43%) associate Depression
most frequently (.02 level). Women (62%)
associate Love to Clouds, whereas men (49%)
are more likely to associate Ambition (.01 
level). 

A final consideration of interest in a sur-
vey of this sort concerns the frequency of
meaning categories chosen, disregarding the
particular construct to which they were to be
associated. Do the experimental groups make
equal use of all meanings, or do some favor a
particular category? Additional tests of sig-
nificance again revealed a fair degree of con-
sistency concerning primary associations, yet
some differences were noted. Male MPs chose
Security (42%) and Love (34%) most fre-
quently, whereas normal men selected Am-
bition (38%) and Security (37%) most fre-
quently (.01 level). No comparable differ-
ences were found when contrasting female
MPs and normal females. However, sex dif-
ferences did exist. Women, as a group, chose
Anger (18%) and Fear (47%) as extremes,
whereas the men had less of a contrast be-
tween extremes with Anger (24%) and De-
pression (39%) as their low and high choices
(.01 level).


Discussion 


Leaving aside the issue of which particular 
meaning was associated with which particular 
construct, it seems fairly apparent that the 
major prediction of this investigation has 
received some verification. A consistency in 
forced association was noted across such lines 
as sex and mental health. Only three actual 
reversals in valence or primary association 
were noted. The slight inconsistencies be-
tween groups probably can be attributed to
chance, although they are explicable when 
taking the groups into consideration. That is, 
women assigning a positive valence to Fur 
more frequently than men is understandable, 
considering the female status implications of
a fur coat. Or, the fact that male MPs use
Love more frequently than normal males
(who use Ambition) when associating to the
constructs makes sense in light of the former’s
circumstances as hospitalized patients in need
of care, etc.


Table 2

Percentage of Choices: Valences Breakdown (N = 160)

Positive Valence (n varies)

Construct      n

Boots        118
Smoke         74
Bear          67
Mask          48
Fur          138
Fire          70
Clouds       133
Rocks        112
Hair         132
Bat           22
Island       138
Mountains    146

[The meaning and percentage columns, and the Negative Valence half
of the table, are not legible in the source.]

Note. N.S. means “not significant.”
* Significant between .05 and .01 levels.
** Significant at or beyond .01 level.

On a more theoretical plane, the results are 
in line with a sociopsychological approach. If 
constructs have a symbolic function then, un- 
less the clinician has truly extensive time in 
which to interview an individual to determine 
unique aspects of his symbolic activity, it 
must be assumed that this is a socially learned 
function, carried by cultural influences. The 
test or dream interpreter, as a member of this 
culture and by assuming the proper set to ob- 
serve, can identify these “symbols” in the 
fashion outlined in the introduction of this 
paper. By studying the tables the reader can 
assess for himself the accuracy with which 
his particular bias of symbolic interpretation 
meets the findings. Suffice to say that many 
common interpretations hold. This study can 
say nothing about the actual meaning of a 
construct. 


Summary 


The present research approaches the prob-
lem of construct symbolism with a sociopsy-
chological rationale. It is assumed that, if con-
structs chosen from the Rorschach have the
content meanings often ascribed to them by 
clinicians, then a polling of assorted groups 
should reflect such consistencies in forced as- 
sociations. Assorted groups of Ss polled are 
found to reflect certain consistencies in their 
forced associations, cutting across such lines
as sex and mental health. The major predic- 
tion is considered verified. 


Received October 13, 1958. 


A NOTE ON LEVIN’S REPORT OF FINDINGS WITH
PHENOBARBITAL, PROMETHAZINE, CHLOR- 
PROMAZINE, AND PLACEBO 


LLOYD K. SINES 


University of Minnesota 


In view of the many reports in the litera- 
ture and other sources of information relative 
to designing and conducting adequate studies 
in the area of psychopharmacology (Kline, 
1959; Rashkis & Smarr, 1958), there seems 
little excuse for the persistence of many of 
the major tactical and other errors apparent 
in this area of important and copious research. 
A recent study by Levin (1959), reported in 
this Journal, exemplifies several such unneces- 
sary methodological errors. In view of the fact 
that reports in the professional literature of 
psychopharmacological studies will undoubt- 
edly increase in the immediate future, con- 
tinuing efforts must be made to improve re- 
search design and methodology and, also, to 
maintain high standards for the publication 
of such reports. 

I should like to note two types of errors 
exemplified in Levin's study and, also, to 
comment on what appears to be an error in 
the author’s interpretation of his own data. 


Methodological Shortcomings 


1. The use of an arbitrary and uniform 
dosage of each medication for all patients in 
the study constitutes the major methodologi- 
cal error in the study. This practice cannot 
be defended in view of the knowledge of indi- 
vidual differences among persons in their re- 
sponsiveness to various chemical compounds. 
In this connection, it has been reported by 
Winkelman (1957) that a ratio of 35 to 1 
has been found in the individual patient toler- 
ance of, or reaction to, ataractic drugs. At 
the present time, the use of an experienced 
physician’s clinical judgment in the determi- 
nation and individualization of drug dosage 
on a patient-to-patient basis is much preferred 
over the practice of employing a single uni-
form dosage of the therapeutic agents used.

2. The possibility of bias in the behavioral 
ratings cannot be ruled out in view of the 
fact that the raters may have been aware of 
differences among the groups in the prepara- 
tions which the patients received. That is, 
though it is stated that chlorpromazine and its 
placebo were “similar” in appearance, pre- 
sumably the phenobarbital and promethazine 
preparations were distinguishable in appear-
ance (and possibly in other ways) from the
other two agents. Further, it is common 
knowledge that even under the so-called 
“double-blind” regime, ward personnel (and 
raters) are able to predict with high accuracy 
the actual group identification of patients re- 
ceiving alternative therapeutic agents (Hall 
& Dunlap, 1955). It is, of course, not inevi- 
table that such information leads to biased 
judgments of change among patients, although 
it is necessarily a limiting condition in the 
experimental design. The use of standard 
liquid or parenteral modes of drug adminis-
tration is a desirable (though often impracti-
cal) procedure aimed toward minimizing pos-
sibilities for the identification of the specific
group status of patients on such a priori 
grounds. 

3. An over-all attrition rate of 23% among 
patients originally included in the study ap-
pears somewhat high and suggests the possi-
bility that even if the original patient sample
were a random or representative one with 
respect to the parent population, the final 
analyses (and therefore generalizations) may 
well have been based upon a unique or selec- 
tive sample of subjects. 

Statistical Weaknesses


1. Data (and statistical analyses) support- 
ing the conclusion that fewer symptoms were 
observed in all groups at the end of the 
follow-up period than at the beginning of the 
study are not offered. The graph presented 
by the author, on the other hand, suggests 
that the phenobarbital and placebo groups 
were about the same (in terms of Wittenborn 
scores) at the end of the follow-up period as 
at the beginning of the study. The latter, of 
course, would imply that a general or over- 
all decrease in symptomatology at the end of 
that period could be accounted for by im- 
provements among chlorpromazine and pro- 
methazine group patients. 

2. Similarly, data are not provided which 
presumably provided the basis for the con- 
clusion that there was “a significant increase 
in symptoms” among the groups from the end 
of the treatment period to the end of the 
follow-up period. On the contrary, the graph 
provided suggests only minimal regression to 
initial level among patients in the chlorproma- 
zine group. It is conceivable, therefore, that 
the therapeutic improvement in the latter 
group was maintained during the follow-up 
period. 

3. In view of the (significant?) differences 
on the Wittenborn scale among the groups at 
the beginning of the study, the use of the 
analysis of covariance at the end of the treat- 
ment period and at the end of the follow-up 
period would have been preferable to the use 
of the analysis of variance. 

4. The investigators failed to pursue the 
data sufficiently by neglecting the application 
of appropriate small sample statistics (t tests
between related means) to the data. Such 
tests might well have revealed that, while all 
drug groups improved during the treatment 
period, placebo group patients remained es- 
sentially unimproved and, further, that all 
groups except chlorpromazine returned to 
their previous symptomatic levels after a two- 
month no-medication period. 
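
Points 3 and 4 above can be made concrete with a short sketch. It is offered only as an illustration of the two techniques named, the t test between related means and the covariance adjustment of final means for initial level. All scores below are invented for the example and are not data from Levin's study; the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(before, after):
    """t for related means: mean within-patient change over its standard error."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1


def adjusted_means(groups):
    """Covariance-adjusted final means.

    groups maps a group name to (initial scores, final scores). Each final
    mean is corrected for that group's distance from the grand initial mean,
    using the pooled within-group regression slope.
    """
    sxy = sxx = 0.0
    all_initial = []
    for initial, final in groups.values():
        mx, my = mean(initial), mean(final)
        sxy += sum((x - mx) * (y - my) for x, y in zip(initial, final))
        sxx += sum((x - mx) ** 2 for x in initial)
        all_initial.extend(initial)
    slope = sxy / sxx
    grand_initial = mean(all_initial)
    return {
        name: mean(final) - slope * (mean(initial) - grand_initial)
        for name, (initial, final) in groups.items()
    }


# Invented symptom scores for one group, before and after treatment.
t_value, df = paired_t(before=[10, 12, 14, 16], after=[8, 11, 11, 14])

# Invented groups that start at very different initial levels.
adjusted = adjusted_means({
    "drug": ([10, 12, 14], [8, 10, 12]),
    "placebo": ([20, 22, 24], [20, 22, 24]),
})
```

In the invented example the unadjusted final means differ by 12 points, most of which reflects the groups' different starting levels; after adjustment the difference shrinks to the 2 points of change that actually separate them. This is the sense in which the analysis of covariance is preferable when groups differ at the outset.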

Finally, a gross error in the interpretation 
of the data presented appears to have been
made, i.e., differential effects among the ex-
perimental groups were demonstrated at the
end of the treatment period (F = 21.26, P <
.01), and may well have persisted through the
follow-up phase (an analysis of variance was 
not carried out on the latter data), indicating 
that in addition to whatever general improve- 
ments were common to the groups, there was 
further differential change due to the several 
treatments. The author’s conclusion that “all 
groups showed improvement of similar de-
grees” is, therefore, in direct contradiction to
the data offered. 

While this writer agrees with Levin’s state-
ment that “in designing drug research, great
care must be taken to control incidental vari- 
ables [and] caution must be used in inter- 
preting uncontrolled clinical studies in which 
treatment methods are evaluated,” I fail to 
see how the data presented by Levin consti- 
tute an empirical corroboration of such cliché- 
like admonitions. 

I find such methodological bungling in psy- 
chopharmacological studies at this late date 
exasperating and, further, such tactical and 
interpretive errors inexcusable in articles pub- 
lished in the psychological literature. 

Received May 1, 1959.


A REPLY TO SINES’ NOTE ON A COMPARISON OF THE
EFFECTS OF PHENOBARBITAL, PROMETHAZINE, 
CHLORPROMAZINE, AND PLACEBO UPON 
MENTAL HOSPITAL PATIENTS 


MONROE L. LEVIN


Oneida Consolidated Schools, Oneida, New York 


Sines’ recent note (Sines, 1959) criticizes 
aspects of the methodology and statistical 
methods used in the writer’s recently pub- 
lished research (Levin, 1959). In evaluating 
the original paper in the light of Sines’ criti- 
cisms and Sines’ comments, attention should 
be paid to the initial paragraphs of the study; 
the major purpose of the study was to deter- 
mine whether, as had been anticipated in ad- 
vance, variables other than drugs were of 
sufficient strength to produce measurable and 
observable changes in the behavior of mental 
hospital patients, and whether the behavioral 
changes would be in the directions antici- 
pated. 


Methodology 


Sines objected to use of arbitrary and uni- 
form dosage in each of the subject groups on 
the grounds that there are wide fluctuations in 
individual responses to chemical compounds. 
He further reports that others have found a 
wide range of patient responses to ataractic 
drugs, concluding that individualized drug 
dosage is to be preferred to uniform dosage 
of therapeutic agents. 

The study in question was conceived in 
1955, at a time when ataractic drugs were in 
the process of becoming available in the state 
hospital at which the study was conducted. 
At that time, uniform dosages were often 
used, both because of the relative newness of 
the ataractic drugs in state mental hospitals, 
and because of the large numbers of patients 
for whom individual physicians were respon- 
sible. 

Sines’ criticism did not take into account 
the fact that if the ataractic used in the study 


does, as was then being reported, produce 
dramatic and highly positive results in an al- 
most consistent way, use of the agent in even 
the limited fashion reported should have pro- 
duced positive and differential results similar 
to ones reported earlier by others. It was rec- 
ognized that uniform dosage might attenuate 
the results, but it was not believed that this 
would be sufficient to completely attenuate 
the effects of the drugs which were used. 

Use of physician-directed, modifiable dos- 
ages of drugs and placebo was considered but 
necessarily rejected as a possibility because 
of the innumerable problems it would have 
presented both in the administrative struc- 
ture of the hospital and in the design of the 
study. 

The dosages of ataractic and antihistamine 
were suggested by the pharmaceutical con- 
cerns involved as ones which were then con- 
sidered to be optimal under the conditions of 
the research. 

Sines states that bias of raters may have 
been a major factor in slanting the findings, 
because of the ability of ward personnel to 
identify the various drugs and placebo used.
Attention is called to the methods used to
minimize bias. If any of the ward personnel
were biased because of their ability to dis-
tinguish between the various tablets, the
identical (not “similar” as Sines states) ap-
pearance of the placebo and chlorpromazine
might well have exaggerated, rather than
minimized, differences between the groups,
and might well have favored a greater differ-
ence than was obtained between the ataractic,
placebo, and other drugs.

Since patients were distributed in almost
random fashion throughout the hospital, it
was assumed that any bias which arose would 
be inconsistent and sufficiently attenuated to 
become negligible. 

Sines’ third criticism is of the attrition 
rate (23%) reported. He indicated that the 
rate of attrition may have been a reflection 
of sampling bias. Every patient in a 2,000-
bed hospital who met the selection criteria
was initially included in the study. Reference 
is made to the original paper in which it was 
indicated that most subjects were lost for 
administrative reasons. “Administrative rea- 
sons” included such factors as failure of 
ward personnel to administer drugs, failure 
to rate patient behavior at appropriate in- 
tervals, and failure to administer appropriate 
drugs. The vast majority of subjects were 
lost because of administrative reasons, as was 
originally reported. The attrition rate, there-
fore, was clearly not due to sampling factors
but, rather, to factors which have no demon- 
strable relation to patient behavior or group- 
ing in the study. 

Since the hospital at which the study was 
conducted serves an area representing one 
quarter of the geographic area of the state in 
which it is located, and since almost all eli-
gible patients participated, it is doubtful that
the sample represented anything other than
that which it was intended to represent.


L. Levin 


Statistical Method 


In view of the purpose of the study, the 
data were examined to determine if there 
were differential shifts over time in symp- 
toms, as measured by means of the Witten- 
born scales. Differential shifts should produce 
significant interaction between treatment and 
time. The analysis of variance (see Table 1, 
Levin, 1959) clearly showed that there was 
no significant interaction. 

Sines appears to have confused the mean- 
ing of the analysis of variance reported. The 
significant F reported in Table 1 does not in- 
dicate, as he says, that there were differential 
effects among the experimental groups at the
end of the treatment period; the F merely indi-
cates that significant shifts occurred in the
symptom scores of all groups during the 
course of the study.


ABSENCE OF ACQUIESCENCE RESPONSE SET IN
THE TAYLOR MANIFEST ANXIETY SCALE¹


LOREN J. CHAPMAN 


University of Chicago 


AND DONALD T. CAMPBELL


Northwestern University 


When attitude and personality scales of the 
agree-disagree or true-false format contain un- 
equal numbers of items worded in the direc- 
tion of the two ends of the personality dimen- 
sion, the response to the content of the items 
may be confounded with general tendencies to 
agree or disagree (Cronbach, 1946). A num- 
ber of recent studies have investigated such 
acquiescence response set in the F scale from 
The Authoritarian Personality study (Adorno, 
Frenkel-Brunswik, Levinson, & Sanford, 1950) 
by means of correlating scales of reversed
items with scales of original items. A recent 
review (Chapman & Bock, 1958) has shown 
that all of these studies found evidence of 
substantial acquiescence variance. 

Another scale which might be suspected of 
acquiescence bias is the Taylor Manifest Anx- 
iety Scale (Taylor, 1953), in which 39 of the 
50 items are worded so that agreement with 
the item contributes toward a high MA score. 

A reversed MAS was constructed by simple 
literal reversals of meaning, e.g., for “I am 
often sick to my stomach,” the reversal was 
“I am seldom sick to my stomach.”

If there were acquiescence variance in the 
MAS, it would be expected to influence the 
correlation of the MAS with other measures 
which themselves are correlated with acquies- 
cent tendencies. To study such possible ef- 
fects, each S was also given original and re- 
versed F and E scales (Adorno et al., 1950). 

Two versions of the test were prepared, 
each having half of the items from each scale 
in positive form and the other half negative,
with each item appearing in its positive form
in one scale and its negative form in the
other. Each S received the two forms one
week apart. Since the E scale measures atti-
tudes toward minority groups, Jewish, Negro,
and Oriental Ss were not included. The final
experimental group was 184 college students.

1 This study was supported in part by the Social
Science Research Committee of the University of
Chicago and in part by the Graduate School, North-
western University.

As an additional intelligence measure for 
86 of the 184 Ss, OSPE percentile ranks were 
obtained from the university admissions rec-
ords, and for another 70 a sum of verbal plus
mathematical scores on the college boards was 
employed. 


Results 


The Kuder-Richardson reliabilities of the 
positive and negative forms of the MAS were 
.87 and .85, respectively, and the correlation 
between positive and negative halves was .84. 
The correlation is seen to be almost as high 
as the reliabilities of the scales, a finding 
which is in striking contrast to the results of 
similar studies on the F scale, including that 
of Christie, Havel, and Seidenberg (1958). 
We conclude that the items of the MAS, un- 
like those of the F scale, are subject to little 
or no acquiescence bias. This is probably at- 
tributable to the quite specific and personal 
reference of the items. 
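
As a rough illustration of why a correlation of .84 counts as "almost as high as the reliabilities," the sketch below applies Spearman's classical correction for attenuation to the three figures reported above. The authors do not report this computation; treating the Kuder-Richardson values as the reliabilities of the two halves is an assumption made here, and the function name is hypothetical.

```python
import math


def correct_for_attenuation(r_xy: float, rel_x: float, rel_y: float) -> float:
    """Spearman's correction: observed r divided by sqrt of the two reliabilities."""
    if not (0.0 < rel_x <= 1.0 and 0.0 < rel_y <= 1.0):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_xy / math.sqrt(rel_x * rel_y)


# Figures reported in the text: r = .84, reliabilities .87 and .85.
estimate = correct_for_attenuation(0.84, 0.87, 0.85)
print(round(estimate, 2))  # 0.98
```

Under these assumptions the estimated correlation between error-free positive and negative halves is close to 1, which is the pattern expected when little variance is attributable to acquiescence.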

Table 1 shows the intercorrelations of the 
two MA scales with the other measures. Note 
that both the positive and negative forms of 
MAS show correlations with the original F 
scale (Fpos), significant beyond the .01 level
of .19. No significant correlation with intelli- 
gence or with the E scale is found. 

Note that neither MApos nor MAneg repre-
sents the original MAS. When the MAS is re-
constituted from the item pool of the present
study, it has a reliability of .84. The correla-
tion between MAS and Fpos is .29.


BRIEF REPORTS


VALIDATION OF DOPPELT’S WAIS SHORT FORM 


WITH A CLINICA 


HUGH CLAYTON 


Colorado Sta 


In 1956, Jerome Doppelt published his find-
ings relative to a brief form of the WAIS. The
brief form consisted of the following four sub-
tests: Arithmetic, Vocabulary, Block Design, and
Picture Arrangement. The sum of the four scaled
scores is multiplied by 2.5 and increased by a
constant which varies with the subject’s age. The
constants, for their respective age ranges, are as
follows: 16-34: 10 44 45-54 55-64:
7, 65-74: 5, 75+: 4. The final sum is equivalent
to the Full Scale “Scaled Score.” The IQ is then
found in the IQ table for the sub-
ject’s age.
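
The arithmetic of the brief form can be sketched as follows. This is only an illustration of the rule stated above (sum of four scaled scores, times 2.5, plus an age constant); the function and parameter names are hypothetical, the age constant is passed in by the caller because the full list of constants is not legible here, and the final IQ lookup in Wechsler's tables is not reproduced.

```python
def brief_scale_scaled_score(
    arithmetic: int,
    vocabulary: int,
    block_design: int,
    picture_arrangement: int,
    age_constant: float,
) -> float:
    """Estimate the Full Scale scaled score from the four brief-form subtests."""
    subtest_sum = arithmetic + vocabulary + block_design + picture_arrangement
    # Rule from the text: multiply the sum by 2.5, then add the age constant.
    return 2.5 * subtest_sum + age_constant


# Hypothetical subject aged 65-74 (constant 5 per the text), scaled score 10 on each subtest.
print(brief_scale_scaled_score(10, 10, 10, 10, age_constant=5))  # 105.0
```

The result would then be converted to an IQ with the table appropriate to the subject's age.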

Using the above formula, Doppelt computed
the Brief Scale Scores for Wechsler’s standardi-
zation population, broke them up into age groups,
and then ran Pearson correlations between
Brief Scale IQ and the Full Scale IQ for each of
the age groups. Pearson r’s ranged from .9 .06.

Wechsler’s standardization population consisted
of “normal” subjects. The question raised here is 
that of validity with a clinical population. There 
would be a real possibility that the correlations 
might not run nearly so high with a population 
of mentally disturbed patients whose intellectual 
functioning is erratic in varying degrees. Another 
question to be answered concerns the possibility 
that the Brief Scale IQ might be more valid with 
certain


THE VALIDITY OF A SELF REPORT MEASURE OF ANXIETY
AS A FUNCTION OF THE TIME INTERVAL COVERED
BY THE INSTRUCTIONS¹


BARCLAY MARTIN 


University of Wisconsin 


The instructions accompanying most paper- 
and-pencil scales of personality are vague with 
respect to the time interval for which the person 
is rating himself, although the usual implication 
is that the person is answering the items in terms 
of how he has generally been during his life. For 
many traits, and especially for a trait such as 
anxiety, there is little doubt but that most peo- 
ple vary considerably from day to day or month 
to month in the extent to which they possess the 
trait. It is not surprising therefore to find that a 
scale such as the Taylor Manifest Anxiety Scale, 
taken in terms of how anxious one generally is, 
frequently fails to predict anxiety as assessed by 
other methods in particular situations. The pur-
pose of the present research was to experimen-
tally explore the effect of different instructional
time periods on the relationship of a self-report
measure of anxiety to a criterion assessment of
anxiety in a specific stress situation.

Anxiety was psychometrically measured by the
Feeling Inventory, a specially constructed forced-
choice inventory, consisting of 23 triplets of ad-
jectives selected on the basis of a previous in-
ternal consistency item analysis, with adjectives
such as “tense” and “relaxed” making up the re-
spective ends of the scale.

Ss were first seen in an individual stress situa-
tion in which they were confronted in close prox-
imity by two experimenters who rather obviously
made continuous ratings on the Ss throughout the
15-minute session. Ss were told that they were
going to be asked to do things that would allow
the experimenters to understand what they were
really like as persons, what their inner wishes and
fantasies were, etc. Ss were then asked to re-
spond to one Rorschach card, one TAT card, and
to free associate for a short period. The two ex-
perimenters independently tallied nervous move-
ments throughout the session and made a global
rating of anxiety on a seven-point scale at the
end of the session. Continuous GSR recordings
were also obtained.

1 An extended report of this study may be ob-
tained without charge from Barclay Martin, Psy-
chology Department, University of Wisconsin, Madi-
son, Wis., or for a fee from the American Docu-
mentation Institute. Order Document No. 6017, re-
mitting $1.25 for microfilm or $1 for photocopies.

At the end of the session, Ss were randomly 
assigned to one of three groups and given instruc- 
tions to take the Feeling Inventory in terms of 
(a) how they generally feel, (b) how they have 
felt during the last month, and (c) how they just 
felt in the 15-minute session.

Only for the “15-minute” instruction group
was there a significant tendency for Feeling In-
ventory scores to correlate with any of the cri-
terion anxiety measures in the expected direction.
The highest correlation for this group was .44 for
base level skin conductance. For the “last month”
instruction group, there were unexpected negative
correlations between rated anxiety and nervous
movements and the Feeling Inventory. Manifest
Anxiety scores obtained in an earlier group
session did not correlate significantly with
any of the criterion measures.


MANIFEST ANXIETY AND PERFORMANCE ON PROBLEM
SOLVING TASKS¹


DURGANAND SINHA 


Indian Institute of Technology, Kharagpur, India


In this study, the performance of 20 high scor-
ing and 20 low scoring subjects on the Taylor
Scale of Manifest Anxiety was compared on a
set of six problem solving tasks which were novel
and of moderate difficulty. Three of the tasks
were sensorimotor in character (Substitution Test,
Design Sorting Test, and Katona Matchstick
Problems), and the other three involved largely
mental activities (Line-Pursuit Test, Mixed Sen-
tences, and Number Series). Within the general
framework of Hull’s behavior theory, it was ex-
pected that tasks being complex, involving a
hierarchy of competing responses, the higher
drive level, as measured by high scores on the
Taylor scale, would lead to impairment of per-
formance.

Though HA Ss had more
errors on each task, the performance of the two
groups did not differ significantly in respect of
errors. However, on all the Katona problems
more LA than HA Ss had solved them, χ² being
significant in all instances (Problem 1: […],
p = .01; Problem 2: […]; Prob-
lem 3: χ² = 3.95, p = .05).

But the HA group consistently required more
time for solving both the sensorimotor and men-
tal tasks, t ratios being significant in all
but one (Sorting: t = 4.0, p = .01; Substitution:
t = 3.4, p = .01; Katona: t = 3.4, p = .01; Line […]

1 An extended report of this study may be ob-
tained without charge from Durganand Sinha, In-
dian Institute of Technology, Kharagpur, India, or
for a fee from the American Documentation Insti-
tute. Order Document No. 6018, remitting $1.25 for
microfilm or $1.25 for photocopies.

The inferiority of HA Ss can be explained in
terms of difficulty of integration of responses and
the selection of the correct one out of many pos-
sible responses. Increased drive level multiplied
indiscriminately the strength of all competing re-
sponses. Irrelevant and wrong movements were
made and had to be corrected. This interference
showed itself in slightly higher error score and
significantly longer time required for the solution.
Performance was of less integrative character due
to […] (Malmo & Amsel, 1948)
or general “disorganization of be-
havior” (Davis, 1948). The effect on […]
seems to become a significant variable in test per-
formance. It is particularly so on speed tests.


Received April 6, 1 
Brief Report 
REFERENCES

Davis, D. R. Increase in strength of a secondary
drive as a cause of disorganization. Quart. J. exp.
Psychol., 1948, 1, 22-28.

Malmo, R. B., & Amsel, A. Anxiety-produced in-
terference in serial rote learning with observations
on rote learning after partial frontal lobectomy.
J. exp. Psychol., 1948, 38, 440-454.





Journal of Consulting Psychology 
Vol. 23, No. 5, 1959 


PSYCHOLOGICAL TEST REVIEWS





Caligor, Leopold. The Eight Card Redrawing Test 
(8CRT). Kit containing test instructions, blanks, 
score sheets, and scoring masks, pp. 148. 8CRT, 
P.O. Box No. 31, Gracie Station, New York 28, 
New York. Manual: A New Approach to Figure 
Drawing. Springfield, Ill.: Charles C Thomas, 
1957. 

This technique represents a modification and elabo-
ration of the Draw-A-Person test. The subject is
requested to “Make a picture of a whole person.”
When the drawing is completed a sheet of trans-
parent onionskin paper is placed over it and the
subject is asked to make a second drawing on this
onionskin. His exact instructions are as follows:
“You may do anything you like with this picture
of a person. You may add to it, take away from it,
change it or leave it alone. Only again, make a pic-
ture of a whole person.” When the second drawing
is completed, the first drawing is then concealed and
an onionskin sheet placed over the second drawing.
A third drawing is requested from the subject and
he receives the same instructions as he did for the
second drawing. This process is continued until a
total of eight drawings have been obtained.

Caligor devised this technique because he felt that
“The single drawing left so many questions unan-
swered.” He expected that multiple drawings would
provide a much richer yield of data about the sub-
ject, particularly in terms of the continuities and
discontinuities that were apparent in the total series.
The manual for the 8CRT contains an elaborate
scoring system for evaluating well over twenty dif-
ferent variables. These variables relate primarily to
structural aspects of the drawings. Some of the
principal scoring categories are of the following
order:

Degree to which complete figure drawn

Height of figure

Placement of figure on page ([…] left)

Stance of figure

Ratio of head size to body size (height & width)

Sex of figure

Transparencies

Erasures

Relation of body walls and body areas of the fig-
ure to the preceding figure

Degree of movement of figure

Manner in which the body is filled in (e.g., amount
of detail and shading)

Line quality (tremble, pressure, thickness)

Symmetry of placement of body parts

Clothing

Omission of body parts

Each category is interpreted in terms of the degree
to which it shifts and varies over the entire series of
eight drawings. Caligor offers brief formulations con-
cerning what he considers to be the meanings of
the various categories. Illustratively, in referring to
“placement on the page” he says, “Placements other
than center reveal limitation in the ability to ade-
quately cope with the environment. They indicate
inadequate cognizance or handling of stimulation and
attempted avoidance of the full impact of the envi-
ronment.” Further, in his description of the signifi-
cance of “erasures” he states, “Most people erase on
two or more drawings. The absence of erasures
reflects limitation of corrective and critical faculties,
possible lowered reality testing and frequently a
labile reaction to stimulation. Excessive erasures re-
flect hypercriticality, blocking or conflict and over-
controlled resistance to stimulation.” In order to
clarify how his scoring categories may be applied
he presents detailed analyses in his manual of the
drawings of three different subjects. These analyses
are of a broadly impressionistic character and ap-
parently involve the intuitive discovery of a mean-
ingful pattern in a large array of scores.

Caligor does refer to three studies which were un-
dertaken for validation purposes, but his descrip-
tions of these studies are extremely brief and it is
clear that he did not attempt to test directly the
validity of the meanings assigned to the various
scoring categories. One can see that his formulations
regarding the significance of given variables are
based almost entirely on his clinical experiences and
that one must accept them on faith. The 8CRT is
presently a collection of hunches that have been
formalized by Caligor into a scoring system. Some of
his hunches are novel and interesting and may prove
eventually to have an enriching effect upon figure
drawing analysis. Thus one is particularly impressed
with the novelty and potential importance of his
emphasis on such factors as spatial directionality
(e.g., up vs. down and right vs. left), symmetry,
and mode of maintaining continuity from one draw-
ing to another.

Overall, though, it must be said that his scoring
categories are not bound together by a unifying con-
cept or viewpoint. They seem, on the contrary, to
consist of heterogeneous “signs” which were assem-
bled in an arbitrary fashion. One must also question
the value of Caligor’s mode of defining many of the
scoring categories. His definitions are often very
vague and hazy. Illustratively, he refers to various
signs as indicating “anxiety,” “conflict,” “immatu-
rity,” “lowered ability to orient oneself in the envi-
ronment,” “ability to use inner resources,” “aware-
ness of objects or other persons in the environment.”
How much more would one know about a given
subject for having acquired such vague bits of in-
formation about him? The 8CRT does indeed pre-
sent some new ideas about figure drawing analysis
but it lacks the rationale or validation to be consid-
ered a formal test.—Seymour Fisher

Bennett, George K., Seashore, Harold G., & Wesman,
Alexander G. Differential Aptitude Tests. Manual
(3rd ed.), pp. 94. New York: Psychological Corp.,
1959.

The new manual follows very closely the format
of the previous (1952) edition (see J. consult. Psy-
chol., 1953, 17, 78). The main change is that a 16-
page research supplement has been added. This sup-
plement offers data designed to validate the com-
bination of the Verbal Reasoning and Numerical
Ability scores into an index of scholastic aptitude
and, in addition, six other studies selected for their
special interest to users of the tests. The bibliography
has increased from 27 to 105 items. The manual re-
mains a model of its category.—E. S. B.


Carter, H. D. California Study […]
Grades 7 to 13. Test booklet (no time limit) […]
manual, 16 pp., […] sheets, […]
Los Angeles: California Test […]

This 150-item inventory yields […] Atti-
tudes toward School, Mechanics of […]
and System, Verification. The last […]
items with “the thirty most popular responses” is
supposed to validate the other scales. A good deal of
work seems to have gone into the development of
this instrument; its reliability seems adequate, and
there is encouraging evidence of its validity for use
in educational diagnosis with high school students.
Little evidence is offered to support the supposed
function of the Verification score.—E. S. B.


respects from the original. The directions for ad-
ministration now include a section on machine scor-
ing; a general adult sample now provides an addi-
tional norm group; the bibliography has been ex-
panded from 9 to 82 references, most of the […]
references representing studies in which the […]
figured. The general adult sample is less than ade-
quate for normative purposes because it consists of
a nationwide sample of male and female household
heads who are members of a consumer purchase
panel used for market surveys. It is not […]
this means with regard to its possible biases with
respect to age, education, and class status. The mean
scores of this sample vary from those in the other
norm group, college students. But there is no effort
to […] the significance of these differences for
the validity of the test. In fact, it is astonishing that
no effort was made to summarize the […] of
the large body of studies in the bibliography to the
test’s validity and reliability. The […] of the
first manual pointed to its deficiencies with regard
to validating evidence […] the […]
and publisher bothered to offer a n which r


one item in each triad that is most clearly repre-
sented by the blot or by some part of the blot. 
Choice of an item in a triad contributes to each of 
the four general categories of scoring much as in the 
standard situation. 

Standard scores are provided for 15 variables based 
upon a normative sample which tries to make up in 
numbers what it lacks in definitiveness. The norma- 
tive sample is an amalgam of 967 Brigham Young 
University students and 7094 selected occupational 
groups. The reported test-retest reliabilities (one- 
week interval) are acceptable but not great, rang- 
ing between .62 and .90. The 15 factors are either 
taken singly or combined in various ways to derive 
26 different attributes. 

The validity of the test seems to rest mainly on 
two bases: first, its supposed relationship to Ror- 
schach phenomena; second, two empirical studies. 
Without considerable empirical evidence, there is 
great room for doubt that responses to the SORT 
are mainly tapping perceptual phenomena analogous 
to its prototype. The method of test administration 
contains no procedures that ensure that the subject 
has seen the percept that he chooses nor for that
matter that he has even looked at the blot. Response
sets of various sorts, particularly social desirability
stereotypes, are probably greatly enhanced under
such circumstances. The manual gives a brief report
of two unpublished validity studies. In one, the 
SORT increased the correlation of high school grade 
point average with first year grades from .59 to .68, 
using the best 2 of 15 scores. No cross-validation 
data are offered. In the other, the SORT variables 
are correlated with supervisors’ ratings in 29 occu-
pational groups. A suggestive but not impressive array
of correlations was obtained. Again no cross-valida- 
tion. 
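
To put the first validity study in perspective, the reported rise from .59 to .68 can be restated as a gain in variance accounted for. The sketch below assumes, as is conventional but not stated in the manual as summarized here, that both figures are correlations with first year grades and that squaring them gives the proportion of criterion variance explained; the function name is hypothetical.

```python
def variance_explained_gain(r_base: float, r_full: float) -> float:
    """Difference in squared correlations: added proportion of criterion variance."""
    return r_full ** 2 - r_base ** 2


# Reported figures: .59 for high school grade point average alone, .68 with two SORT scores added.
gain = variance_explained_gain(0.59, 0.68)
print(round(gain, 4))  # 0.1143
```

On these assumptions the two selected scores add roughly 11 percentage points of explained variance in the original sample. Because they were the best 2 of 15 and no cross-validation is reported, some shrinkage of this figure in a new sample would be expected.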

This all sums up to the fact that we have here an 
interesting new experiment in adapting Rorschach 
testing techniques to the need for large scale testing 
and objective scoring and interpretation. But it is 
still experimental, and every effort should be made 
to warn against adopting it for operational use. I 
do not believe the “Preliminary Edition” in the 
title or the caution section in the manual, largely 
irrelevant to this issue, represent sufficient effort to 
emphasize the fact that this instrument is not ready 
for operational use.—E. S. B.